Skip to content

Preserve actionable initialize errors after CLI process exit#702

Open
shuofengzhang wants to merge 2 commits intoanthropics:mainfrom
shuofengzhang:fix-initialize-error-during-execution-message
Open

Preserve actionable initialize errors after CLI process exit#702
shuofengzhang wants to merge 2 commits intoanthropics:mainfrom
shuofengzhang:fix-initialize-error-during-execution-message

Conversation

@shuofengzhang
Copy link
Copy Markdown

What changed

  • Capture result messages with subtype="error_during_execution" in Query._read_messages and store the CLI-provided error text.
  • When the reader later exits with a ProcessError, propagate the captured CLI error text to pending control requests (such as initialize) instead of the generic process-exit placeholder.
  • Add a regression test (TestInitializeErrorPropagation) that reproduces the failure mode and verifies the propagated exception contains the real CLI error text.

Why

  • In the failing initialize path, the CLI can emit an error_during_execution result on the message stream and then exit non-zero.
  • Previously, the non-zero exit became a generic ProcessError ("Check stderr output for details") that overwrote pending control requests, so callers waiting on initialize() lost the actionable root-cause message.

Insight / Why this matters

The core issue is a cross-channel ordering bug:

  1. The real failure reason is emitted as a result message (error_during_execution) on the message stream.
  2. initialize() is waiting on the control-response channel, not message iteration.
  3. Process exit then raises ProcessError, and that generic error replaced the pending control result.

This is easy to miss because both error signals exist, but only one is observed by initialize callers. Preserving the CLI-emitted execution error keeps diagnostics aligned with actual root cause.

Practical gain / Why this matters

  • Callers now receive actionable initialize failures (for example, invalid/missing session IDs) instead of a generic stderr hint.
  • Improves reliability of error handling in SDK integrations and reduces debugging/triage time when startup fails.

Testing

  • pytest -q tests/test_query.py -k "initialize_uses_error_during_execution_result_text"
  • pytest -q

Risk analysis

  • Scope is intentionally narrow: only affects how pending control waiters are failed when a ProcessError follows a known error_during_execution result.
  • No protocol or transport shape changes.
  • Regression test covers the exact failure ordering to protect against future regressions.

@shuofengzhang shuofengzhang force-pushed the fix-initialize-error-during-execution-message branch from 16e3002 to ec7291b Compare March 28, 2026 23:42
@km-anthropic
Copy link
Copy Markdown
Collaborator

@claude review

Comment on lines +261 to +262
if isinstance(e, ProcessError) and self._last_execution_error:
pending_error = Exception(self._last_execution_error)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 ProcessError is downgraded to a bare Exception on line 262: pending_error = Exception(self._last_execution_error). This means callers using the documented except ProcessError as e: pattern (README line 263) will not catch this error, and the structured exit_code/stderr attributes are lost. Should be ProcessError(self._last_execution_error, exit_code=e.exit_code, stderr=e.stderr).

Extended reasoning...

What the bug is

On line 262 of query.py, when a ProcessError is caught and _last_execution_error is set, the code creates a new bare Exception to replace it:

if isinstance(e, ProcessError) and self._last_execution_error:
    pending_error = Exception(self._last_execution_error)

This discards the ProcessError type and its structured attributes (exit_code, stderr), replacing them with a plain Exception carrying only the error message string.

How it propagates

The pending_error is stored in pending_control_results[request_id] (line 267). When _send_control_request() retrieves this result, it checks isinstance(result, Exception) and raises it (lines 434-435). This propagates through initialize() and ultimately to the caller of query(). The caller receives a bare Exception instead of the expected ProcessError.

Why existing code doesn't prevent it

The test added by this PR (test_initialize_uses_error_during_execution_result_text) catches with except Exception as e:, which catches all exception types indiscriminately. If the test used except ProcessError as e: — the pattern documented in the README — it would fail, exposing the type downgrade.

Step-by-step proof

  1. CLI emits a result message with subtype="error_during_execution" containing "No conversation found with session ID ab2c985b"
  2. _read_messages() stores this in self._last_execution_error
  3. CLI then exits non-zero, causing transport.read_messages() to raise ProcessError("Command failed with exit code 1", exit_code=1, stderr="Check stderr output for details")
  4. The except Exception as e: block catches this ProcessError
  5. Since isinstance(e, ProcessError) is True and _last_execution_error is set, line 262 creates pending_error = Exception("No conversation found with session ID ab2c985b") — a bare Exception
  6. This bare Exception is stored in pending_control_results and later raised by _send_control_request()
  7. A caller doing except ProcessError as e: (as documented in README line 263) will NOT catch this — it falls through as an unhandled Exception
  8. Even if caught with a broader handler, e.exit_code and e.stderr are unavailable since plain Exception has no such attributes

Impact

This breaks the public API contract for error handling. ProcessError is part of the SDK's public API (exported in __init__.py, documented in the README). Any SDK consumer following the recommended error handling pattern will experience unhandled exceptions in the specific scenario this PR aims to improve.

Fix

Replace line 262 with:

pending_error = ProcessError(self._last_execution_error, exit_code=e.exit_code, stderr=e.stderr)

This preserves the actionable error message (the PR's goal) while maintaining the correct exception type and structured attributes. The test should also be updated to catch ProcessError specifically to prevent future regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants