You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
PR #140 fixes #132 by adding explicit is_break_signal checks inside execute_join_node and execute_race_node. That's the right point fix, but the underlying pattern is fragile and will keep producing the same class of bug.
Today, df.break() is encoded as an in-band JSON sentinel ({"__break__": true, "value": ...}) carried inside the same Result<String, String> that every node returns. Every compound node (~>, if, join, race, and any future construct like try/catch, with_timeout, compensate, …) must remember to:
Call is_break_signal(&child_result) after awaiting each child.
Early-return before doing its own post-processing (storing result_name, merging branch results, etc.).
The compiler offers no help. #132 happened because exactly this step was forgotten in two places. The next compound node added will likely reintroduce it.
There's also a minor collision risk: a user query returning JSON like {"__break__": true} would be misinterpreted as a break signal.
Proposal
Encode break as a typed error variant and let ? handle propagation:
enumNodeError{Break(String),// unwinds until the nearest loop catches itFailure(String),// real failure, propagates to orchestration result}typeNodeResult = Result<String,NodeError>;
Then:
execute_break_node returns Err(NodeError::Break(value)) instead of the JSON sentinel.
execute_then_node, execute_join_node, execute_race_node, execute_if_node become child().await? — propagation is automatic, no explicit is_break_signal checks.
execute_loop_node is the only place that catches it:
SubtreeEnvelope gains a typed field so a break inside execute_subtree is re-raised as Err(Break(..)) in the parent rather than smuggled through the JSON result string.
Benefits
The bug class disappears. Forgetting to propagate break becomes a type error — ? handles it.
No sentinel collision — user payloads cannot impersonate control flow.
extract_break_value / BREAK_SENTINEL JSON-munging goes away. The value travels as a plain String field on the enum (no double serialization).
JOIN multi-break semantics become explicit. Today the first-in-iteration-order break silently wins; with a typed variant we can pick semantics deliberately.
"df.break() outside a loop" becomes easy to detect. It surfaces as a single uncaught NodeError::Break at the orchestration boundary, which we can convert to a clean error instead of returning a JSON sentinel as the function's final result (the current behavior).
Costs
Mechanical refactor across every node handler signature in src/orchestrations/execute_function_graph.rs.
Small change to SubtreeEnvelope and the JOIN/RACE sub-orchestration boundary to carry the typed control-flow signal back to the parent.
Need to update unit/E2E tests that may rely on the sentinel format (currently none surface it publicly, but worth verifying).
is_break_signal, extract_break_value, and BREAK_SENTINEL are removed.
New test: df.break() used at the top level of df.start() (no enclosing loop) produces a clear, actionable error rather than completing with a sentinel as the result.
New test: df.break() inside a nested compound (e.g., df.if(..., df.break('x'), ...) inside a JOIN inside a loop) correctly exits the outer loop.
Background
PR #140 fixes #132 by adding explicit
is_break_signalchecks insideexecute_join_nodeandexecute_race_node. That's the right point fix, but the underlying pattern is fragile and will keep producing the same class of bug.Today,
df.break()is encoded as an in-band JSON sentinel ({"__break__": true, "value": ...}) carried inside the sameResult<String, String>that every node returns. Every compound node (~>,if,join,race, and any future construct liketry/catch,with_timeout,compensate, …) must remember to:is_break_signal(&child_result)after awaiting each child.result_name, merging branch results, etc.).The compiler offers no help. #132 happened because exactly this step was forgotten in two places. The next compound node added will likely reintroduce it.
There's also a minor collision risk: a user query returning JSON like
{"__break__": true}would be misinterpreted as a break signal.Proposal
Encode break as a typed error variant and let
?handle propagation:Then:
execute_break_nodereturnsErr(NodeError::Break(value))instead of the JSON sentinel.execute_then_node,execute_join_node,execute_race_node,execute_if_nodebecomechild().await?— propagation is automatic, no explicitis_break_signalchecks.execute_loop_nodeis the only place that catches it:SubtreeEnvelopegains a typed field so a break insideexecute_subtreeis re-raised asErr(Break(..))in the parent rather than smuggled through the JSON result string.Benefits
?handles it.extract_break_value/BREAK_SENTINELJSON-munging goes away. The value travels as a plainStringfield on the enum (no double serialization).df.break()outside a loop" becomes easy to detect. It surfaces as a single uncaughtNodeError::Breakat the orchestration boundary, which we can convert to a clean error instead of returning a JSON sentinel as the function's final result (the current behavior).Costs
src/orchestrations/execute_function_graph.rs.SubtreeEnvelopeand the JOIN/RACE sub-orchestration boundary to carry the typed control-flow signal back to the parent.Acceptance
tests/e2e/sql/22_break_in_join_race.sqlfrom Fix df.break() silently ignored inside JOIN/RACE branches #140.is_break_signal,extract_break_value, andBREAK_SENTINELare removed.df.break()used at the top level ofdf.start()(no enclosing loop) produces a clear, actionable error rather than completing with a sentinel as the result.df.break()inside a nested compound (e.g.,df.if(..., df.break('x'), ...)inside a JOIN inside a loop) correctly exits the outer loop.Related