fix: record errors on spans for anthropic, groq, and mistralai by Dheeraj-Bhaskaruni · Pull Request #3970 · traceloop/openllmetry

Dheeraj-Bhaskaruni · 2026-04-10T17:05:24Z

Summary

Fixes #412

When API calls to foundation model providers raise exceptions, the OpenTelemetry spans were not being properly marked as failed. This PR adds error recording to three instrumentation packages:

anthropic: Sync and async wrappers caught exceptions for metrics but never called span.record_exception(), span.set_status(ERROR), or span.end()
groq: Same issue — exceptions were caught for duration metrics but span was left in UNSET state
mistralai: No try/except at all around wrapped() calls, so exceptions propagated with spans never ended or marked

Changes

All three packages now follow the same pattern used by opentelemetry-instrumentation-openai:

except Exception as e:
    span.record_exception(e)
    span.set_status(Status(StatusCode.ERROR, str(e)))
    span.end()
    raise

Files changed

packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py
packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py

Test plan

ruff check passes on all changed files
Error spans appear correctly in trace exporters when API calls fail
Exceptions still propagate to callers after being recorded
Streaming and non-streaming paths both handle errors correctly

Summary by CodeRabbit

Bug Fixes
- Improved error tracking in observability instrumentation for Anthropic, Groq, and Mistral AI integrations. Exceptions are now automatically recorded with detailed status information and properly closed in monitoring systems, enabling better error visibility, more accurate failure reporting, and enhanced debugging capabilities.

…mentations When API calls to foundation model providers raise exceptions, the spans were not being marked as failed. This adds span.record_exception() and span.set_status(ERROR) to the error paths in both sync and async wrappers for these three packages. - anthropic: added record_exception, set_status(ERROR), and span.end() - groq: added record_exception, set_status(ERROR), and span.end() - mistralai: wrapped API calls in try/except with full error recording Fixes traceloop#412

coderabbitai · 2026-04-10T17:05:43Z

📝 Walkthrough

Walkthrough

The changes add exception handling to synchronous and asynchronous wrappers in three OpenTelemetry instrumentation packages (anthropic, groq, mistralai). When exceptions occur during API calls, spans now explicitly record the exception, set status to ERROR with the exception message, and end the span before re-raising.

Changes

Cohort / File(s)	Summary
Exception Span Recording `packages/opentelemetry-instrumentation-anthropic/.../anthropic/__init__.py`, `packages/opentelemetry-instrumentation-groq/.../groq/__init__.py`, `packages/opentelemetry-instrumentation-mistralai/.../mistralai/__init__.py`	Added exception handling in sync and async wrappers to record exceptions in active spans, set span status to ERROR with exception message via `span.record_exception(e)` and `span.set_status(...)`, and explicitly end spans with `span.end()` before re-raising exceptions.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 When errors crash through async code,
Our spans now catch them on the road,
Recording each exception's tale,
With ERROR status—they won't fail! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and accurately summarizes the main change: recording errors on spans for three specific foundation model instrumentation packages.
Linked Issues check	✅ Passed	All changes directly address issue `#412` by adding error recording, status setting, and span termination when exceptions occur in anthropic, groq, and mistralai instrumentation packages.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to fixing error logging on spans for the three specified packages, with no unrelated modifications introduced.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

CLAassistant · 2026-04-10T17:05:44Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
567-581: ⚠️ Potential issue | 🔴 Critical

AnthropicStream and context manager wrappers do not properly handle span closure on streaming errors.

The exception handler in AnthropicStream.__next__() (lines 226-230) increments the exception counter but does not call record_exception(), set_status(StatusCode.ERROR), or span.end(). When iteration fails mid-stream, the span remains incomplete and unmarked as errored.

Additionally, WrappedMessageStreamManager.__exit__() and WrappedAsyncMessageStreamManager.__aexit__() delegate entirely to the underlying stream manager without checking or handling the span state. If an exception propagates during the with/async with block, the span is never marked as errored or properly closed.

While AnthropicAsyncStream.__anext__() (lines 374-384) correctly handles streaming exceptions by setting error status and ending the span, the sync path and both context manager wrappers leave spans in an incomplete state.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py`
around lines 567 - 581, The sync streaming path and context-manager wrappers
must mirror the async error-handling: in AnthropicStream.__next__ ensure that
when an exception occurs you call span.record_exception(e),
span.set_status(Status(StatusCode.ERROR, str(e))) and span.end() (and record
duration_histogram/exception_counter like the async path) before re-raising;
similarly, in WrappedMessageStreamManager.__exit__ and
WrappedAsyncMessageStreamManager.__aexit__ detect if an exception is propagating
(non-None exc_type/exception) and, if a span exists on the wrapper, call
span.record_exception(exc), span.set_status(Status(StatusCode.ERROR, str(exc)))
and span.end() (and update duration/exception metrics) so spans are always
marked errored and closed on stream errors.
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py (1)
256-269: ⚠️ Potential issue | 🟠 Major

Stream exception handling bypasses record_exception() and span.end() for mid-stream failures.

The try/except blocks at lines 269–276 and 355–361 only catch exceptions during generator creation, not during chunk iteration. Exceptions raised while consuming _create_stream_processor() or _create_async_stream_processor() will not trigger record_exception(), set_status(StatusCode.ERROR), or span.end() since those operations are only called after normal loop completion. The for/async for loops inside these processors must wrap iteration in try/except, plus add a regression test for exceptions raised after the first chunk.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py`
around lines 256 - 269, The stream processors (_create_stream_processor and
_create_async_stream_processor) only handle exceptions during generator creation
but not during iteration, so wrap the actual for/async for iteration inside a
try/except that mirrors the existing exception handler: on any exception compute
end_time, build attributes via error_metrics_attributes(e), record duration to
duration_histogram if present, call span.record_exception(e),
span.set_status(Status(StatusCode.ERROR, str(e))), span.end(), then re-raise;
ensure the normal span.end() remains on normal completion. Also add a regression
test that raises an exception after yielding the first chunk to verify
record_exception, set_status and span.end are invoked for mid-stream failures.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py`:
- Around line 417-423: The streaming accumulator generators
_accumulate_streaming_response and _aaccumulate_streaming_response currently
iterate without protection, so exceptions raised mid-stream leave spans open;
update both functions to wrap their iteration loop in a try/except/finally: in
except call span.record_exception(e) and
span.set_status(Status(StatusCode.ERROR, str(e))) then re-raise, and in finally
ensure span.end() and any _handle_response() cleanup always run; also add a unit
test that simulates an exception thrown mid-stream (e.g., a generator that
raises partway through) to assert record_exception, set_status and span.end are
invoked to prevent regression.

---

Outside diff comments:
In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py`:
- Around line 567-581: The sync streaming path and context-manager wrappers must
mirror the async error-handling: in AnthropicStream.__next__ ensure that when an
exception occurs you call span.record_exception(e),
span.set_status(Status(StatusCode.ERROR, str(e))) and span.end() (and record
duration_histogram/exception_counter like the async path) before re-raising;
similarly, in WrappedMessageStreamManager.__exit__ and
WrappedAsyncMessageStreamManager.__aexit__ detect if an exception is propagating
(non-None exc_type/exception) and, if a span exists on the wrapper, call
span.record_exception(exc), span.set_status(Status(StatusCode.ERROR, str(exc)))
and span.end() (and update duration/exception metrics) so spans are always
marked errored and closed on stream errors.

In
`@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py`:
- Around line 256-269: The stream processors (_create_stream_processor and
_create_async_stream_processor) only handle exceptions during generator creation
but not during iteration, so wrap the actual for/async for iteration inside a
try/except that mirrors the existing exception handler: on any exception compute
end_time, build attributes via error_metrics_attributes(e), record duration to
duration_histogram if present, call span.record_exception(e),
span.set_status(Status(StatusCode.ERROR, str(e))), span.end(), then re-raise;
ensure the normal span.end() remains on normal completion. Also add a regression
test that raises an exception after yielding the first chunk to verify
record_exception, set_status and span.end are invoked for mid-stream failures.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7db9efe9-3b41-47dd-95e1-3740189167fc

📥 Commits

Reviewing files that changed from the base of the PR and between 786d49f and fa7e018.

📒 Files selected for processing (3)

packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py
packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py

coderabbitai · 2026-04-10T17:14:27Z

.../opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py

+    try:
+        response = wrapped(*args, **kwargs)
+    except Exception as e:
+        span.record_exception(e)
+        span.set_status(Status(StatusCode.ERROR, str(e)))
+        span.end()
+        raise


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd packages/opentelemetry-instrumentation-mistralai && wc -l opentelemetry/instrumentation/mistralai/__init__.py

Repository: traceloop/openllmetry

Length of output: 120

🏁 Script executed:

cd packages/opentelemetry-instrumentation-mistralai && sed -n '400,480p' opentelemetry/instrumentation/mistralai/__init__.py

Repository: traceloop/openllmetry

Length of output: 2298

🏁 Script executed:

cd packages/opentelemetry-instrumentation-mistralai && grep -n "_accumulate_streaming_response\|_aaccumulate_streaming_response" opentelemetry/instrumentation/mistralai/__init__.py

Repository: traceloop/openllmetry

Length of output: 310

🏁 Script executed:

cd packages/opentelemetry-instrumentation-mistralai && sed -n '200,280p' opentelemetry/instrumentation/mistralai/__init__.py

Repository: traceloop/openllmetry

Length of output: 3142

🏁 Script executed:

cd packages/opentelemetry-instrumentation-mistralai && sed -n '410,445p' opentelemetry/instrumentation/mistralai/__init__.py

Repository: traceloop/openllmetry

Length of output: 948

🏁 Script executed:

cd packages/opentelemetry-instrumentation-mistralai && sed -n '460,495p' opentelemetry/instrumentation/mistralai/__init__.py

Repository: traceloop/openllmetry

Length of output: 1066

Stream consumption failures bypass span cleanup.

The new exception handlers (lines 417–423 and 469–478) only catch errors while obtaining the response object via wrapped(). When streaming is enabled, control is returned to the caller via generator functions (_accumulate_streaming_response and _aaccumulate_streaming_response) that iterate the stream without any try/except/finally wrapper. If an exception occurs during iteration—such as a network error, malformed chunk, or timeout—the exception propagates to the caller while the generator's remaining code (including _handle_response() and span.end()) is never executed. This leaves the span open and unrecorded.

Both accumulator functions must wrap their iteration loops in try/except/finally to ensure record_exception(), set_status(), and span.end() are always called. Add a test that raises an exception mid-stream to prevent regression.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py` around lines 417 - 423, The streaming accumulator generators _accumulate_streaming_response and _aaccumulate_streaming_response currently iterate without protection, so exceptions raised mid-stream leave spans open; update both functions to wrap their iteration loop in a try/except/finally: in except call span.record_exception(e) and span.set_status(Status(StatusCode.ERROR, str(e))) then re-raise, and in finally ensure span.end() and any _handle_response() cleanup always run; also add a unit test that simulates an exception thrown mid-stream (e.g., a generator that raises partway through) to assert record_exception, set_status and span.end are invoked to prevent regression.

coderabbitai bot reviewed Apr 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: record errors on spans for anthropic, groq, and mistralai#3970

fix: record errors on spans for anthropic, groq, and mistralai#3970
Dheeraj-Bhaskaruni wants to merge 1 commit intotraceloop:mainfrom
Dheeraj-Bhaskaruni:fix/add-error-logging-to-spans

Dheeraj-Bhaskaruni commented Apr 10, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Apr 10, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

CLAassistant commented Apr 10, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Dheeraj-Bhaskaruni commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Files changed

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

CLAassistant commented Apr 10, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Dheeraj-Bhaskaruni commented Apr 10, 2026 •

edited

Loading

coderabbitai bot commented Apr 10, 2026 •

edited

Loading