Skip to content

refactor(consensus): Acknowledge Sequencer Inserts#2513

Open
refcell wants to merge 3 commits intomainfrom
rf/refactor/acknowledge-sequencer-inserts
Open

refactor(consensus): Acknowledge Sequencer Inserts#2513
refcell wants to merge 3 commits intomainfrom
rf/refactor/acknowledge-sequencer-inserts

Conversation

@refcell
Copy link
Copy Markdown
Contributor

@refcell refcell commented May 4, 2026

Summary

This changes local sequencer payload insertion from enqueue acknowledgement to engine acknowledgement by returning the inserted unsafe head after engine processing and unsafe-head watch advancement. It removes the sequencer parent handoff workaround and deletes the unused external SealRequest actor path. The remaining external unsafe insert path stays fire-and-forget for network and delegated callers.

@refcell refcell added the consensus Area: consensus label May 4, 2026
@refcell refcell self-assigned this May 4, 2026
@cb-heimdall
Copy link
Copy Markdown
Collaborator

cb-heimdall commented May 4, 2026

🟡 Heimdall Review Status

Requirement Status More Info
Reviews 🟡 0/1
Denominator calculation
Show calculation
1 if user is bot 0
1 if user is external 0
2 if repo is sensitive 0
From .codeflow.yml 1
Additional review requirements
Show calculation
Max 0
0
From CODEOWNERS 0
Global minimum 0
Max 1
1
1 if commit is unverified 0
Sum 1

@vercel
Copy link
Copy Markdown

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
base Ignored Ignored Preview May 5, 2026 11:15pm

Request Review

Comment on lines +233 to 241
async fn execute(&self, state: &mut EngineState) -> Result<(), InsertTaskError> {
let result = self.insert_payload(state).await;
if self.result_tx.is_some() {
self.send_channel_result(result).await;
Ok(())
} else {
result.map(|_| ())
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When result_tx is Some and insert_payload fails, this method sends the error through the channel and returns Ok(()). This means the outer EngineTask::execute retry loop (in task.rs:248-283) sees Ok(()) from execute_inner and does not retry the task — the Temporary retry logic is completely bypassed for sequencer inserts.

This is an intentional design choice (the sequencer handles the error via the channel response), but it has a subtle consequence: errors that would normally be retried by the engine (e.g. InsertFailed, UnexpectedPayloadStatus, ForkchoiceUpdateDidNotAdvance — all classified as Temporary) are instead surfaced as a one-shot failure to the sequencer. The sequencer's SealStepError::Insert path will retry on the next seal step tick, but this loses the tight yield_now retry loop the engine normally provides for transient RPC errors.

Worth confirming this is the intended trade-off. If transient engine RPC errors should still be retried at the engine level before responding to the sequencer, consider only returning Ok(()) on the success path and propagating errors normally (letting the engine retry), then sending the success result through the channel.

Comment on lines +216 to +232
let mut unsafe_head_rx = self.unsafe_head_rx.clone();
loop {
let unsafe_head = *unsafe_head_rx.borrow_and_update();
if unsafe_head == inserted_head {
trace!(
target: "sequencer",
block_number = inserted_head.block_info.number,
block_hash = %inserted_head.block_info.hash,
"Insert unsafe payload acknowledged"
);
return Ok(inserted_head);
}

unsafe_head_rx.changed().await.map_err(|_| {
EngineClientError::ResponseError("unsafe head channel closed.".to_string())
})?;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop waits for unsafe_head_rx to match exactly inserted_head. If a reorg, reset, or concurrent external unsafe insert causes the watch channel to advance past inserted_head (to a different block at the same or higher number), this loop will hang indefinitely — it can never match and the changed() call will keep succeeding as long as the sender is alive.

Consider adding a timeout, or comparing block numbers (e.g. unsafe_head.block_info.number >= inserted_head.block_info.number) rather than requiring exact equality, with appropriate handling for the case where the head advanced to a different block than expected.

Comment on lines +86 to +98
pub fn unsafe_payload_with_result(
client: Arc<EngineClient_>,
rollup_config: Arc<RollupConfig>,
envelope: BaseExecutionPayloadEnvelope,
result_tx: Option<mpsc::Sender<InsertTaskResult>>,
) -> Self {
Self {
client,
rollup_config,
envelope,
payload_safety: InsertPayloadSafety::Unsafe,
result_tx,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When result_tx is None, this constructor is functionally identical to unsafe_payload. Consider whether this constructor should require a mpsc::Sender (non-optional) since its sole purpose is the "with result" variant, and use unsafe_payload when no channel is needed. This would make the API self-documenting and avoid the Option ambiguity at the call site.

At the enqueue_unsafe_payload_insert call site in engine_request_processor.rs, the external path already passes None — it could just call InsertTask::unsafe_payload directly instead.

@refcell
Copy link
Copy Markdown
Contributor Author

refcell commented May 4, 2026

Note for the next cleanup step: the acknowledged insert result is intentionally using an mpsc sender in this PR, even though the result is logically one-shot. The current task queue still requires InsertTask / EngineTask to be cloneable, and tokio oneshot senders are not Clone. Once the engine path is flattened and direct methods replace the task wrappers, we should convert this and the other request/response channels to oneshot where each request has exactly one response.

@refcell refcell marked this pull request as ready for review May 4, 2026 20:44
.await?;

if self.result_tx.is_some() && state.sync_state.unsafe_head() != new_block_ref {
return Err(InsertTaskError::ForkchoiceUpdateDidNotAdvance);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guard only fires when result_tx.is_some(), meaning non-sequencer (external/derivation) inserts silently succeed even if the FCU didn't actually advance the unsafe head. If the FCU returning a mismatched head is a real concern (e.g., the EL rejected the forkchoice), it seems like it should be checked unconditionally regardless of the caller. If it's only relevant for the sequencer's watch-channel flow, the check might belong in send_channel_result or in insert_unsafe_payload on the service side rather than here, so the engine-level task doesn't have caller-specific branching.

@refcell refcell requested a review from danyalprout May 5, 2026 13:02
refcell and others added 2 commits May 5, 2026 09:06
Make local sequencer payload insertion return the inserted head after engine processing and unsafe-head watch advancement. Remove the sequencer parent handoff workaround and unused external SealRequest actor route.

Co-authored-by: Codex <noreply@openai.com>
Update the action harness SequencerEngineClient implementation for acknowledged insert results and restore the insert task constructor as const for clippy.

Co-authored-by: Codex <noreply@openai.com>
@refcell refcell force-pushed the rf/refactor/acknowledge-sequencer-inserts branch from 65ea1f1 to 0c0bc7b Compare May 5, 2026 13:09
Use the acknowledged inserted head as the next sequencer build parent instead of waiting for the unsafe-head watch channel to publish the same value.

Co-authored-by: Codex <noreply@openai.com>
Comment on lines +233 to +234
async fn execute(&self, state: &mut EngineState) -> Result<(), InsertTaskError> {
let result = self.insert_payload(state).await;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When insert_payload fails and result_tx.is_some(), the error is sent to the sequencer via the channel and execute returns Ok(()). Since mpsc::Sender::send takes T by value, the error is moved into the channel — it's consumed. If the channel send itself fails (receiver dropped), the error is silently lost: send_channel_result logs a warning but has no way to propagate the original InsertTaskError back to the engine's retry loop.

This means a sequencer insert can fail with a Temporary error (e.g., InsertFailed, transient RPC timeout) AND the channel can be closed (sequencer task cancelled), and the engine treats it as success — no retry, no error propagation, no record of the failure beyond the warn! log.

Comment on lines +334 to +336
next_payload_to_seal = self.builder.build_on(inserted_head).await?;
if next_payload_to_seal.is_none() {
build_ticker.reset_immediately();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When build_on(inserted_head) returns None (parent not yet indexed), reset_immediately() fires and the next tick falls through to the else branch at line 398 which calls self.builder.build(). build() reads the unsafe head from the watch channel. Since the engine has acknowledged the insert, the watch channel should reflect the correct head — but there's a race window between the engine updating its internal sync state and the watch channel sender publishing the new value. If the watch hasn't been updated yet, build() will use the previous (grandparent) head and create a block on the wrong parent.

This was the exact scenario the removed next_build_parent stash protected against. Worth confirming that the engine's watch channel publish is guaranteed to happen before insert_unsafe_payload responds on the result channel.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

Review Summary

This PR replaces the fire-and-forget sequencer insert with an acknowledged insert (engine responds via an mpsc channel after the InsertTask completes), removes the next_build_parent workaround and the unused external SealRequest actor path. The approach is sound and simplifies the sequencer flow.

Findings

Error loss on channel failure (task.rs:233-234): When result_tx is Some and insert_payload fails, the error is moved into the channel via send_channel_result. If the channel send also fails (receiver dropped), the original error is silently lost — execute returns Ok(()), and the engine's retry loop treats it as success. Consider falling back to propagating the error to the engine when the channel send fails.

Race window on build_on fallback (actor.rs:334-336): When build_on(inserted_head) returns None and the ticker retries, it falls back to self.builder.build() which reads the unsafe head from the watch channel. The removed next_build_parent stash protected against a race where the watch channel hadn't yet published the new head. Worth verifying the engine's watch publish happens-before the result channel send.

Stale README reference: crates/consensus/service/README.md:66 still references SealRequest, which has been removed.

Missing unit test coverage: The unsafe_payload_with_result constructor and the execute method's channel-based acknowledgement flow (the core behavior change) have no unit tests at the engine crate level. The service-level test in engine_client.rs covers the happy path, but the error/fallback paths through execute when result_tx.is_some() are untested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

consensus Area: consensus

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants