
fix(google-genai): migrate span attrs to gen_ai.input/output.messages (fixes #3515) #3948

Open
abhyudayareddy wants to merge 8 commits into traceloop:main from abhyudayareddy:fix/google-genai-span-attr-migration

Conversation

@abhyudayareddy commented Apr 6, 2026

Summary

Fixes #3515.

Migrates google_generativeai/span_utils.py from deprecated indexed span attributes to the stable OTel GenAI Semantic Conventions:

  • gen_ai.prompt.{N}.content / gen_ai.prompt.{N}.role -> gen_ai.input.messages (JSON array)
  • gen_ai.completion.{N}.content / gen_ai.completion.{N}.role -> gen_ai.output.messages (JSON array)

The new attributes store a JSON-serialised list of {"role": "...", "parts": [...]} objects, consistent with the format used by the OpenAI instrumentation in this repo.
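For illustration, a single-turn call would serialize into the two new attributes roughly like this (the message text below is hypothetical, not taken from the PR):

```python
import json

# Hypothetical example of the new aggregated attribute values.
input_messages = [
    {"role": "user", "parts": [{"type": "text", "content": "What is OpenTelemetry?"}]},
]
output_messages = [
    {"role": "assistant", "parts": [{"type": "text", "content": "An observability framework."}]},
]

# Each list is serialized once and written as a single span attribute,
# replacing the per-index gen_ai.prompt.{N}.* / gen_ai.completion.{N}.* keys.
attrs = {
    "gen_ai.input.messages": json.dumps(input_messages),
    "gen_ai.output.messages": json.dumps(output_messages),
}
```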

Changes

  • span_utils.py: Removed _set_prompt_attributes; set_input_attributes and set_input_attributes_sync now build an input_messages list and write it to gen_ai.input.messages; set_response_attributes builds an output_messages list and writes it to gen_ai.output.messages.
  • tests/test_generate_content.py: Updated assertions to parse the new JSON blob attributes instead of the old indexed ones.
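For intuition, the new write pattern can be sketched as follows. The names `build_input_messages` and `set_input_attributes` here are simplified stand-ins for the span_utils.py code, not the exact implementation (which also handles images and SDK objects):

```python
import json

GEN_AI_INPUT_MESSAGES = "gen_ai.input.messages"  # stable attribute key

def build_input_messages(contents):
    """Sketch: collapse `contents` (str or list) into one messages list."""
    if isinstance(contents, str):
        return [{"role": "user", "parts": [{"type": "text", "content": contents}]}]
    messages = []
    for item in contents:
        messages.append({
            "role": getattr(item, "role", "user") or "user",
            "parts": [{"type": "text", "content": str(item)}],
        })
    return messages

def set_input_attributes(span_attrs, contents):
    # Serialize the whole conversation once instead of N indexed writes.
    span_attrs[GEN_AI_INPUT_MESSAGES] = json.dumps(build_input_messages(contents))

attrs = {}
set_input_attributes(attrs, "Tell me a joke")
```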

Checklist

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: fix(instrumentation): ...
  • (If applicable) I have updated the documentation accordingly.

Summary by CodeRabbit

  • Refactor

    • Input/output now emitted as aggregated, JSON-encoded input_messages and output_messages arrays; text parts use a consistent {role, parts} schema.
    • Image parts standardized to an OpenAI-style image_url representation.
    • Model penalties now recorded under LLM_* attributes; response_id and deprecated finish-reason attributes removed; token-provider labeling adjusted to "Google".
  • Tests

    • Tests updated to validate aggregated JSON attributes, token usage under LLM_USAGE_TOTAL_TOKENS, and assert deprecated per-index attributes are not emitted.

@CLAassistant commented Apr 6, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai bot commented Apr 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Refactors span attribute emission to stop writing per-index gen_ai.prompt.* and gen_ai.completion.*. Inputs are aggregated into a single JSON gen_ai.input.messages attribute and outputs into gen_ai.output.messages. Image parts and text parts normalized to OpenAI-style message/part objects; tests updated accordingly.

Changes

Cohort / File(s): Summary

Span utils (Google GenAI instrumentation)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
Rewrote input/output processing pipeline: removed legacy per-part extractors and indexed-span-attribute emission; introduced async helpers producing OpenAI-style part objects (type, content) and image_url structure; sync image path adjusted to return image_url; aggregates all inputs into input_messages and all assistant outputs into output_messages, serialized once to GenAIAttributes.GEN_AI_INPUT_MESSAGES / GenAIAttributes.GEN_AI_OUTPUT_MESSAGES. set_model_request_attributes now sets presence/frequency penalties to SpanAttributes.LLM_PRESENCE_PENALTY / LLM_FREQUENCY_PENALTY. set_model_response_attributes removed emission of response_id and deprecated finish-reasons attributes; token histogram provider/op-name sources adjusted (second provider name now "Google").

Tests
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
Updated span assertions to decode and validate aggregated GenAIAttributes.GEN_AI_INPUT_MESSAGES and GEN_AI_OUTPUT_MESSAGES JSON: assert first input role is "user" and first output role is "assistant", parts contain type=="text" and content field; assert no emitted attributes start with gen_ai.prompt. or gen_ai.completion.; switched token usage assertion to SpanAttributes.LLM_USAGE_TOTAL_TOKENS. Removed unused MagicMock import; retained/added import json.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I nibbled at spans and stitched the message threads,
Packed prompts and answers into one little bed,
Texts say "content", images point to a URL,
A tidy JSON basket — the rabbit gives a twirl. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 50.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: the title 'fix(google-genai): migrate span attrs to gen_ai.input/output.messages (fixes #3515)' directly describes the main change of migrating from deprecated span attributes to new ones, addressing issue #3515.
  • Linked Issues check ✅ Passed: the PR fully satisfies the coding objectives from #3515: replaces deprecated gen_ai.prompt/completion with gen_ai.input/output.messages in JSON-serialized format, removes _set_prompt_attributes, updates input/output attribute setting logic, and improves test coverage to validate the new message structure.
  • Out of Scope Changes check ✅ Passed: all changes are directly aligned with the scope of #3515, migrating to new GenAI semantic conventions. The refactoring of content/part processing, removal of deprecated functions, and test updates are all necessary to complete the migration and do not introduce unrelated functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)

49-57: Strengthen schema assertions for migrated message payloads.

Current checks only validate role; they won’t catch regressions in parts structure (type/content) that this migration depends on.

✅ Suggested test hardening
     assert GenAIAttributes.GEN_AI_INPUT_MESSAGES in attrs
     input_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_INPUT_MESSAGES])
     assert len(input_msgs) > 0
     assert input_msgs[0]["role"] == "user"
+    assert isinstance(input_msgs[0].get("parts"), list) and len(input_msgs[0]["parts"]) > 0
+    assert input_msgs[0]["parts"][0]["type"] == "text"
+    assert "content" in input_msgs[0]["parts"][0]

     assert GenAIAttributes.GEN_AI_OUTPUT_MESSAGES in attrs
     output_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_OUTPUT_MESSAGES])
     assert len(output_msgs) > 0
     assert output_msgs[0]["role"] == "assistant"
+    assert isinstance(output_msgs[0].get("parts"), list) and len(output_msgs[0]["parts"]) > 0
+    assert output_msgs[0]["parts"][0]["type"] == "text"
+    assert "content" in output_msgs[0]["parts"][0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`
around lines 49 - 57, The tests currently only assert the "role" field for
messages stored under GenAIAttributes.GEN_AI_INPUT_MESSAGES and
GenAIAttributes.GEN_AI_OUTPUT_MESSAGES; extend these assertions to validate the
migrated message payload schema by checking each message's "parts" structure
(e.g., ensure input_msgs[0] and output_msgs[0] have a "parts" list, that
parts[0] contains "type" and "content" keys, and that "content" is the expected
string or non-empty value), so the test fails if the required type/content
fields are missing or empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 409-433: The code currently hardcodes "finish_reason": "stop" for
every appended message; instead, only include a finish_reason field when the
non-streaming Google response provides one (use
response.candidates[0].finish_reason when present) and omit finish_reason for
streaming paths (where response was a list/str or built from response.text).
Update the branches that handle the original non-streaming response object (the
branches referencing response and response.text and appending into
output_messages) to read finish = getattr(response, "candidates", None) and, if
finish and finish[0].finish_reason is truthy, include "finish_reason":
finish[0].finish_reason in the dict; otherwise do not add the finish_reason key.
Leave the streaming branches that handle list/str (the paths enumerating
response or using response as str) unchanged so they do not include
finish_reason.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1215e26d-911c-4d0f-b60a-791811eb5b38

📥 Commits

Reviewing files that changed from the base of the PR and between 0a25803 and 7b328ac.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py

…n unknown

Updated output message construction to include finish_reason when available and adjusted handling for both non-streaming and streaming responses.
… in messages

Add assertions to validate structure of input and output messages in tests.
@abhyudayareddy (Author)

Addressed both CodeRabbit review items in the latest two commits:

  1. finish_reason fix (4862ab3): replaced the hardcoded "finish_reason": "stop" with dynamic extraction from response.candidates[0].finish_reason for non-streaming responses (using .name.lower() on the enum), and omitted finish_reason entirely for streaming paths where it isn't reliably available per-chunk.

  2. Stronger test assertions (e6f86a8): added parts structure validation for both input and output messages — checks that parts is a non-empty list, that parts[0]["type"] == "text", and that "content" is present.
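The dynamic extraction described in point 1 can be sketched like this. The class names and enum values below are simplified stand-ins for the SDK's response/candidate types, not the actual google-generativeai classes:

```python
from enum import Enum

# Hypothetical stand-ins for the SDK's candidate/finish-reason types.
class FinishReason(Enum):
    STOP = 1
    MAX_TOKENS = 2

class Candidate:
    def __init__(self, finish_reason=None):
        self.finish_reason = finish_reason

class Response:
    def __init__(self, candidates=None):
        if candidates is not None:
            self.candidates = candidates

def extract_finish_reason(response):
    """Sketch: candidate enum -> lowercase string, or None when absent."""
    candidates = getattr(response, "candidates", None)
    if candidates and getattr(candidates[0], "finish_reason", None):
        fr = candidates[0].finish_reason
        return fr.name.lower() if hasattr(fr, "name") else str(fr)
    return None  # streaming chunks: omit finish_reason entirely
```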

@coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)

49-63: Assert that the deprecated indexed attributes are gone.

These checks only prove the new JSON blobs exist. They would still pass if gen_ai.prompt.* / gen_ai.completion.* were emitted alongside them, which leaves the core migration behavior unguarded.

🧪 Minimal regression guard
     assert GenAIAttributes.GEN_AI_INPUT_MESSAGES in attrs
+    assert not any(key.startswith("gen_ai.prompt.") for key in attrs.keys())
     input_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_INPUT_MESSAGES])

     assert GenAIAttributes.GEN_AI_OUTPUT_MESSAGES in attrs
+    assert not any(key.startswith("gen_ai.completion.") for key in attrs.keys())
     output_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_OUTPUT_MESSAGES])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`
around lines 49 - 63, Add assertions to ensure deprecated indexed attributes are
not present: after loading attrs and verifying
GenAIAttributes.GEN_AI_INPUT_MESSAGES and
GenAIAttributes.GEN_AI_OUTPUT_MESSAGES, assert that no keys in attrs start with
"gen_ai.prompt." or "gen_ai.completion." (e.g., use
any(k.startswith("gen_ai.prompt.") or k.startswith("gen_ai.completion.") for k
in attrs) and assert that this is False). Reference the attrs variable and the
existing GenAIAttributes.GEN_AI_INPUT_MESSAGES /
GenAIAttributes.GEN_AI_OUTPUT_MESSAGES checks to locate where to add these
assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 202-218: The current handling of kwargs["contents"] in this block
(and the similar block at 253-303) only handles str vs list and treats every
list item as a separate turn, which misclassifies single Content objects and
iterables of Part objects; change the normalization so that you first detect
whether contents is a Content-like object (has attributes role and parts) and
treat it as one turn, detect whether contents is a single Part (has attributes
type/content) and wrap it into a single turn with default role "user", and
detect whether contents is an iterable: if the iterable's items are Content
objects treat each as a separate turn, but if the iterable's items are Part
objects aggregate them into one turn. Use the existing _process_content_item to
process parts, preserve role via getattr(content_item, "role", "user"), and
append consistent dicts to input_messages with keys "role" and "parts"; apply
the same normalization logic in both the block around input_messages
construction and the similar code at lines 253-303.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 970a054d-55c6-4934-90f0-ee656e5f647a

📥 Commits

Reviewing files that changed from the base of the PR and between 7b328ac and e6f86a8.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py

…bsent

Adds negative assertions to test_client_spans to verify that the deprecated gen_ai.prompt.{N} and gen_ai.completion.{N} attributes are absent after the migration to gen_ai.input.messages / gen_ai.output.messages. Addresses CodeRabbit review feedback (round 2).
…tent] from list[Part] in contents normalization

Addresses CodeRabbit major review comment on contents normalization:

1. Adds elif hasattr(contents, "parts") branch to handle a single Content object passed directly (not wrapped in a list).

2. In the elif isinstance(contents, list) branch, sniffs contents[0] to distinguish multi-turn conversations (list[Content], each with .parts) from single-turn inputs (list[Part] or list[str]). Multi-turn: each Content becomes a separate message. Single-turn: all Parts are collected into one user message.

Applied symmetrically to both the async set_input_attributes and sync set_input_attributes_sync.
@abhyudayareddy (Author)

Addressed all CodeRabbit round-2 feedback (commits 08f2f74 and 56f5e51):

test_generate_content.py (commit 08f2f74): Added negative assertions after the existing gen_ai.input.messages / gen_ai.output.messages checks to guard against regression to the deprecated indexed attributes:

assert not any(key.startswith("gen_ai.prompt.") for key in attrs.keys())
assert not any(key.startswith("gen_ai.completion.") for key in attrs.keys())

span_utils.py (commit 56f5e51): Fixed contents normalization to correctly distinguish between input shapes:

  • str → single user message (unchanged)
  • Content object (has .parts): single turn, wrapped as one message
  • list[Content] (each item has .parts): multi-turn conversation, each item becomes a separate message with its own role
  • list[Part] or list[str]: single-turn, all items collected into one role: user message with multiple parts

The fix applies to both the async set_input_attributes and the sync set_input_attributes_sync paths.
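The branching above can be sketched roughly as follows. This is duck-typed: any object with a `.parts` attribute stands in for the SDK's Content type, and the real code additionally routes image parts through dedicated helpers:

```python
def normalize_contents(contents):
    """Sketch of the four input shapes: str, Content, list[Content],
    and list[Part]/list[str]. Not the exact span_utils.py code."""
    if isinstance(contents, str):
        return [{"role": "user", "parts": [{"type": "text", "content": contents}]}]
    if hasattr(contents, "parts"):  # single Content object: one turn
        return [{
            "role": getattr(contents, "role", "user") or "user",
            "parts": [{"type": "text", "content": str(p)} for p in contents.parts],
        }]
    if isinstance(contents, list) and contents:
        if hasattr(contents[0], "parts"):  # list[Content]: multi-turn
            return [{
                "role": getattr(c, "role", "user") or "user",
                "parts": [{"type": "text", "content": str(p)} for p in c.parts],
            } for c in contents]
        # list[Part] or list[str]: single user turn with multiple parts
        return [{
            "role": "user",
            "parts": [{"type": "text", "content": str(p)} for p in contents],
        }]
    return []
```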

@coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)

123-138: ⚠️ Potential issue | 🟠 Major

Normalize Part inputs before serializing gen_ai.input.messages.

The SDK accepts contents as a single Part, groups non-function-call parts into a user turn, groups function-call parts into a model turn, and allows mixed ContentUnion lists to alternate between those grouped turns. This code still only special-cases str / Content / list, forces non-Content lists to "user", and falls back to str(part) for direct Part-like values, so valid tool-calling or multimodal inputs can be dropped or mis-serialized. (googleapis.github.io)

Also applies to: 202-239, 274-340

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 123 - 138, This block mishandles Part-like inputs by only
special-casing str/list and falling back to str(content_item), which can drop or
mis-serialize tool-calling or multimodal Parts; update the processing around
content_item to first normalize any Part/Part-like or ContentUnion instances
into canonical message parts (preserving function-call vs non-function-call
semantics and list alternation rules used by gen_ai.input.messages) before
serializing: detect Part objects (instead of treating them as generic objects),
route image Parts through _is_image_part/_process_image_part as before, group
non-function-call Parts into user turns and function-call Parts into model
turns, and handle single Part or single-list inputs identically to lists so you
never fall back to processed_content.append({"type":"text","content":
str(content_item)}); apply the same normalization logic to the other processing
sites referenced in the review (the blocks around lines handling content lists)
to ensure consistent serialization.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 445-459: The code currently reads a single finish_reason from
_candidates[0] and reuses it for every message in the multi-message branch;
change logic so each message uses its own candidate's finish_reason: iterate
over _candidates (or ensure the loop enumerates response/candidates) and for
each item use that item.finish_reason (handling hasattr(item.finish_reason,
"name") as done now) to set msg["finish_reason"] before appending to
output_messages; update references to response.text/list branching and the
variables _candidates, fr, and _finish_reason accordingly so each candidate's
finish_reason is applied to its own msg.
- Around line 443-487: The handler only serializes response.text; update the
logic in span_utils.py (the block building output_messages used before calling
_set_span_attribute with GenAIAttributes.GEN_AI_OUTPUT_MESSAGES) to iterate
response.parts (and response.function_calls when present) in addition to
response.text so all part types (function_call, inline_data, file_data,
executable_code, code_execution_result, etc.) are captured into the parts list
structure, preserving existing fields like "type", "content" (or structured
payload for function calls), and adding finish_reason when available from
response.candidates; ensure GenerateContentResponse.parts entries are normalized
into the same message format currently used for text so downstream consumers
receive complete gen_ai.output.messages.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 31c464ec-5fbf-4173-853e-e80ee36d116d

📥 Commits

Reviewing files that changed from the base of the PR and between e6f86a8 and 56f5e51.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
✅ Files skipped from review due to trivial changes (1)
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py

@abhyudayareddy (Author)

Addressed CodeRabbit round-3 feedback in commit 6dc73dc:

Item 1 (Major — Handle all response.parts types): Added a _serialize_response_part(part) helper that serializes text parts, function_call parts (with name + args), and falls back to str(part) for other types. set_response_attributes now iterates over response.candidates, reads each candidate's content.parts, and serializes all parts through this helper — so tool-call and multimodal responses are captured in gen_ai.output.messages instead of being dropped.

Item 2 (Minor — Per-candidate finish_reason): Removed the single shared _finish_reason extracted from candidates[0]. Each candidate now reads its own finish_reason field inside the loop, so multi-candidate responses get the correct stop reason per message instead of all inheriting the first candidate's reason.

@coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)

123-139: ⚠️ Potential issue | 🟠 Major

Structured inputs are still dropped or flattened outside the main contents path.

kwargs["contents"] still only matches str, objects with .parts, or list, so a single part-like object falls through without emitting anything. The list and positional-args branches also serialize .text / .parts objects via str(...) instead of the existing content/part serializers, which loses real text, roles, and turn boundaries in gen_ai.input.messages. Please funnel both kwargs and positional args through one shared normalizer.

Also applies to: 158-189, 202-248, 282-378

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 123 - 139, The code currently flattens or drops structured inputs
by only handling str, .parts lists, or list types and by using str(...) instead
of the proper serializers; update span_utils.py to funnel both
kwargs["contents"] and all positional args through a single shared normalizer
function (create a helper like _normalize_contents used by the blocks around
processed_content) that: detects single part-like objects (use _is_image_part)
and other content types, calls the existing serializers for content/part instead
of str(), and routes image parts to _process_image_part with
span.context.trace_id/span_id; replace the duplicated handling in the branches
referenced (including the shown block and the ranges 158-189, 202-248, 282-378)
to call this shared normalizer so gen_ai.input.messages preserves text, roles,
and turn boundaries.
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)

441-447: Align tool-call parts with the shared message schema.

This serializer emits {"type": "function_call"}, but packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:57-94 already uses {"type": "tool_call"} for the same cross-provider part shape. Using a different type here makes downstream parsers provider-specific again.

♻️ Suggested change
         return {
-            "type": "function_call",
+            "type": "tool_call",
             "name": fc.name,
             "arguments": dict(fc.args) if hasattr(fc, "args") else {},
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 441 - 447, The serializer currently returns a dict with "type":
"function_call" for parts where part.function_call is present; change the
emitted type to "tool_call" and keep the same shape (keys "name" and
"arguments") so the output matches the cross-provider schema used by the other
implementation (see the handling of part.function_call -> fc.name and fc.args in
span_utils). Update the return inside the if hasattr(part, "function_call")
block to emit "type": "tool_call" and preserve converting fc.args to a dict when
present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 473-500: The candidate-level fallback that assigns response.text
when parts is empty should be moved out of the per-candidate loop so one
candidate cannot inherit another candidate’s text; instead, only if no candidate
produced any parts at all use the response-wide text to create a single
assistant message. Also fix the second fallback branch to check
isinstance(response.text, list) and iterate over response.text (not response)
when building messages. Update logic around parts, candidates iteration and the
output_messages append (references: parts, candidates loop, response.text,
output_messages, _finish_reason) accordingly.

---

Duplicate comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 123-139: The code currently flattens or drops structured inputs by
only handling str, .parts lists, or list types and by using str(...) instead of
the proper serializers; update span_utils.py to funnel both kwargs["contents"]
and all positional args through a single shared normalizer function (create a
helper like _normalize_contents used by the blocks around processed_content)
that: detects single part-like objects (use _is_image_part) and other content
types, calls the existing serializers for content/part instead of str(), and
routes image parts to _process_image_part with span.context.trace_id/span_id;
replace the duplicated handling in the branches referenced (including the shown
block and the ranges 158-189, 202-248, 282-378) to call this shared normalizer
so gen_ai.input.messages preserves text, roles, and turn boundaries.

---

Nitpick comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 441-447: The serializer currently returns a dict with "type":
"function_call" for parts where part.function_call is present; change the
emitted type to "tool_call" and keep the same shape (keys "name" and
"arguments") so the output matches the cross-provider schema used by the other
implementation (see the handling of part.function_call -> fc.name and fc.args in
span_utils). Update the return inside the if hasattr(part, "function_call")
block to emit "type": "tool_call" and preserve converting fc.args to a dict when
present.
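The requested change amounts to renaming the emitted `"type"` while keeping the `"name"`/`"arguments"` shape. A sketch with a duck-typed `part` (not the real SDK class):

```python
def serialize_part(part):
    """Sketch: emit "tool_call" (cross-provider schema) instead of "function_call"."""
    fc = getattr(part, "function_call", None)
    if fc is not None:
        return {
            "type": "tool_call",  # was "function_call"
            "name": fc.name,
            # Convert args to a plain dict when present, as before.
            "arguments": dict(fc.args) if getattr(fc, "args", None) else {},
        }
    return {"type": "text", "content": getattr(part, "text", "")}
```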
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8f62549c-6d10-4524-98d5-1d4a7bc278d8

📥 Commits

Reviewing files that changed from the base of the PR and between 56f5e51 and 6dc73dc.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py

Comment on lines +473 to +500
```python
                if not parts:
                    # Fallback: try response.text for simple single-part responses
                    try:
                        text = response.text
                        if text:
                            parts = [{"type": "text", "content": text}]
                    except Exception:
                        pass
                if parts:
                    msg = {"role": "assistant", "parts": parts}
                    if _finish_reason:
                        msg["finish_reason"] = _finish_reason
                    output_messages.append(msg)
        else:
            # No candidates field: fall back to response.text
            try:
                if isinstance(response.text, list):
                    for item in response:
                        output_messages.append({
                            "role": "assistant",
                            "parts": [{"type": "text", "content": item.text}],
                        })
                elif isinstance(response.text, str):
                    output_messages.append({
                        "role": "assistant",
                        "parts": [{"type": "text", "content": response.text}],
                    })
            except Exception:
```


⚠️ Potential issue | 🟠 Major

Move the response.text fallback out of the candidate loop.

The current fallback is response-wide, so any candidate with empty content.parts can inherit another candidate’s text. The second fallback also checks isinstance(response.text, list) but iterates response, which will quietly drop output when the response object itself is not iterable.

🛠️ Suggested fix
```diff
-                if not parts:
+                if not parts and len(_candidates) == 1:
                     # Fallback: try response.text for simple single-part responses
                     try:
                         text = response.text
                         if text:
                             parts = [{"type": "text", "content": text}]
                     except Exception:
                         pass
-                if isinstance(response.text, list):
-                    for item in response:
+                if isinstance(response.text, list):
+                    for item in response.text:
                         output_messages.append({
                             "role": "assistant",
-                            "parts": [{"type": "text", "content": item.text}],
+                            "parts": [{"type": "text", "content": getattr(item, "text", item)}],
                         })
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 473 - 500, The candidate-level fallback that assigns response.text
when parts is empty should be moved out of the per-candidate loop so one
candidate cannot inherit another candidate’s text; instead, only if no candidate
produced any parts at all use the response-wide text to create a single
assistant message. Also fix the second fallback branch to check
isinstance(response.text, list) and iterate over response.text (not response)
when building messages. Update logic around parts, candidates iteration and the
output_messages append (references: parts, candidates loop, response.text,
output_messages, _finish_reason) accordingly.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)

167-168: ⚠️ Potential issue | 🔴 Critical

Critical: MagicMock used but not imported.

The test uses MagicMock() on line 167, but there's no import for it. This will cause NameError: name 'MagicMock' is not defined when running this test.

🐛 Proposed fix - add import
```diff
 import json
 
 import pytest
+from unittest.mock import MagicMock
 from opentelemetry.trace import StatusCode, SpanKind
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`
around lines 167 - 168, The test uses MagicMock (see span = MagicMock() in
tests/test_generate_content.py) but MagicMock is not imported; add an import for
MagicMock (e.g., from unittest.mock import MagicMock) at the top of the test
module so the span = MagicMock() and span.is_recording.return_value = True lines
can run without NameError.
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)

720-735: ⚠️ Potential issue | 🔴 Critical

Critical: Variable um used but never assigned, causing NameError.

The code checks hasattr(response, "usage_metadata") but never assigns um. Lines 724, 729, and 734 reference um.total_token_count, um.candidates_token_count, and um.prompt_token_count which will fail at runtime.

🐛 Proposed fix
```diff
     _set_span_attribute(span, GenAIAttributes.GEN_AI_RESPONSE_MODEL, llm_model)
     if hasattr(response, "usage_metadata"):
+        um = response.usage_metadata
         _set_span_attribute(
             span,
             SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS,
             um.total_token_count,
         )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 720 - 735, The code references an unassigned variable um when
reading usage fields; fix by reading the response.usage_metadata into a local
variable (e.g., um = response.usage_metadata) immediately after the
hasattr(response, "usage_metadata") check, guard for None, then call
_set_span_attribute using um.total_token_count, um.candidates_token_count, and
um.prompt_token_count; update the block around response, um, and
_set_span_attribute (and leave SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS /
GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS /
GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS unchanged) so it no longer raises
NameError.

740-741: ⚠️ Potential issue | 🔴 Critical

Add missing imports for _GCP_GEN_AI and _GEN_CONTENT constants.

These constants are used at lines 740-741 in span_utils.py but are not imported. They are defined in __init__.py and should be imported from there to avoid NameError at runtime.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 740 - 741, Import the missing constants _GCP_GEN_AI and
_GEN_CONTENT into span_utils.py so they are available where used in the
attributes dict (GenAIAttributes.GEN_AI_PROVIDER_NAME: _GCP_GEN_AI,
GenAIAttributes.GEN_AI_OPERATION_NAME: _GEN_CONTENT); add them to the
module-level imports by importing _GCP_GEN_AI and _GEN_CONTENT from the package
root (where they are defined in __init__.py) alongside the other constants to
prevent NameError at runtime.
♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)

663-670: ⚠️ Potential issue | 🟡 Minor

Response-wide response.text fallback inside per-candidate loop may cause incorrect attribution.

The fallback to response.text (lines 665-668) is inside the per-candidate loop. For multi-candidate responses, if any candidate has empty content.parts, it will use the response-wide response.text which typically reflects only the first candidate's text. Consider moving this fallback outside the loop or limiting it to single-candidate responses.

🛡️ Suggested guard
```diff
                 if not parts:
-                    # Fallback: try response.text for simple single-part responses
-                    try:
-                        text = response.text
-                        if text:
-                            parts = [{"type": "text", "content": text}]
-                    except Exception:
-                        pass
+                    # Fallback only for single-candidate responses
+                    if len(_candidates) == 1:
+                        try:
+                            text = response.text
+                            if text:
+                                parts = [{"type": "text", "content": text}]
+                        except Exception:
+                            pass
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 663 - 670, The per-candidate fallback that sets parts from
response.text should not run inside the per-candidate loop because response.text
typically reflects the first candidate and will misattribute text to other
candidates; update the logic in span_utils.py so that you only use response.text
when the response has a single candidate (e.g., check len(response.candidates)
== 1) or move the response.text fallback out of the candidate-processing loop
and apply it only when no candidates produced any parts; adjust the branch that
currently checks "if not parts:" inside the candidate loop to instead use a
guarded check (single-candidate) or perform a post-loop fallback using
response.text.

676-691: ⚠️ Potential issue | 🟠 Major

Fallback branch iterates response instead of response.text.

When response.text is a list (line 679), the code iterates over response (line 680) rather than response.text. Additionally, accessing item.text assumes items are objects with a text attribute, which may not be true if response.text contains strings.

🐛 Proposed fix
```diff
         else:
             # No candidates field: fall back to response.text
             try:
                 if isinstance(response.text, list):
-                    for item in response:
+                    for item in response.text:
                         output_messages.append({
                             "role": "assistant",
-                            "parts": [{"type": "text", "content": item.text}],
+                            "parts": [{"type": "text", "content": getattr(item, "text", str(item))}],
                         })
                 elif isinstance(response.text, str):
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 676 - 691, In the else fallback in span_utils.py (the block
handling response.text when no candidates field exists), fix the loop to iterate
over response.text (not response) and handle list item types safely: for each
item in response.text, if the item is a string use it directly as the "content",
otherwise if it has a .text attribute use that, and skip or stringify other
types; update the branch inside the function that builds output_messages so it
checks isinstance(response.text, list) and accesses items correctly instead of
using item.text blindly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 238-261: The function _process_content_item uses an undefined
accumulator and helper: replace the erroneous parts_acc usage with the local
processed_content (i.e., call processed_content.extend(...)) and ensure the
helper _parts_from_genai_part_async is provided — either implement/import a
coroutine named _parts_from_genai_part_async(part, span, part_index) that
returns a list of processed parts or rename the call to the correct existing
async helper if it already exists; update _process_content_item to extend
processed_content with the awaited result and return processed_content.

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`:
- Around line 76-78: The test asserts SpanAttributes.LLM_USAGE_TOTAL_TOKENS
while the implementation sets SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS, causing
a mismatch; fix by making them consistent: either update the test to assert
SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS (and related GEN_AI_* attributes) or
update the implementation (the code that assigns token attributes in
span_utils.py) to also set the LLM_USAGE_* attributes (or set both keys) so both
constants are present; ensure you update or add the corresponding assertions for
input/output/total tokens (SpanAttributes.LLM_USAGE_TOTAL_TOKENS,
GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS,
GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS) consistently.

---

Outside diff comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 720-735: The code references an unassigned variable um when
reading usage fields; fix by reading the response.usage_metadata into a local
variable (e.g., um = response.usage_metadata) immediately after the
hasattr(response, "usage_metadata") check, guard for None, then call
_set_span_attribute using um.total_token_count, um.candidates_token_count, and
um.prompt_token_count; update the block around response, um, and
_set_span_attribute (and leave SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS /
GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS /
GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS unchanged) so it no longer raises
NameError.
- Around line 740-741: Import the missing constants _GCP_GEN_AI and _GEN_CONTENT
into span_utils.py so they are available where used in the attributes dict
(GenAIAttributes.GEN_AI_PROVIDER_NAME: _GCP_GEN_AI,
GenAIAttributes.GEN_AI_OPERATION_NAME: _GEN_CONTENT); add them to the
module-level imports by importing _GCP_GEN_AI and _GEN_CONTENT from the package
root (where they are defined in __init__.py) alongside the other constants to
prevent NameError at runtime.

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`:
- Around line 167-168: The test uses MagicMock (see span = MagicMock() in
tests/test_generate_content.py) but MagicMock is not imported; add an import for
MagicMock (e.g., from unittest.mock import MagicMock) at the top of the test
module so the span = MagicMock() and span.is_recording.return_value = True lines
can run without NameError.

---

Duplicate comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 663-670: The per-candidate fallback that sets parts from
response.text should not run inside the per-candidate loop because response.text
typically reflects the first candidate and will misattribute text to other
candidates; update the logic in span_utils.py so that you only use response.text
when the response has a single candidate (e.g., check len(response.candidates)
== 1) or move the response.text fallback out of the candidate-processing loop
and apply it only when no candidates produced any parts; adjust the branch that
currently checks "if not parts:" inside the candidate loop to instead use a
guarded check (single-candidate) or perform a post-loop fallback using
response.text.
- Around line 676-691: In the else fallback in span_utils.py (the block handling
response.text when no candidates field exists), fix the loop to iterate over
response.text (not response) and handle list item types safely: for each item in
response.text, if the item is a string use it directly as the "content",
otherwise if it has a .text attribute use that, and skip or stringify other
types; update the branch inside the function that builds output_messages so it
checks isinstance(response.text, list) and accesses items correctly instead of
using item.text blindly.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eadafac1-a58b-43fc-9f8c-5c6cfeb6ada0

📥 Commits

Reviewing files that changed from the base of the PR and between 6dc73dc and e1f2f5c.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
  • packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py

Comment on lines 238 to +261
```diff
 async def _process_content_item(content_item, span):
-    parts_acc = []
-    if isinstance(content_item, dict):
-        for p in content_item.get("parts", []):
-            if isinstance(p, dict):
-                parts_acc.append(_map_dict_part(p))
-            else:
-                parts_acc.extend(await _parts_from_genai_part_async(p, span, 0))
-    elif hasattr(content_item, "parts"):
+    """Process a single content item, handling different types (Content objects, strings, Parts)"""
+    processed_content = []
+    if hasattr(content_item, "parts"):
+        # Content with parts (Google GenAI Content object)
         for part_index, part in enumerate(content_item.parts):
             parts_acc.extend(await _parts_from_genai_part_async(part, span, part_index))
     elif isinstance(content_item, str):
-        parts_acc.append({"type": "text", "content": content_item})
+        # Direct string in the list
+        processed_content.append({"type": "text", "content": content_item})
     elif _is_image_part(content_item):
-        img = await _otel_image_part_from_genai_part_async(content_item, span, 0)
-        if img:
-            parts_acc.append(img)
+        # Direct Part object that's an image
+        processed_image = await _process_image_part(
+            content_item,
+            span.context.trace_id,
+            span.context.span_id,
+            0
+        )
+        if processed_image is not None:
+            processed_content.append(processed_image)
     else:
-        parts_acc.append({"type": "text", "content": str(content_item)})
-    return parts_acc
+        # Other content types
+        processed_content.append({"type": "text", "content": str(content_item)})
+    return processed_content
```


⚠️ Potential issue | 🔴 Critical

Critical: Undefined variable parts_acc and function _parts_from_genai_part_async will cause runtime errors.

The static analysis correctly identifies two undefined names:

  1. Line 244 uses parts_acc but the function initializes processed_content
  2. _parts_from_genai_part_async is called but never defined in this module

This will raise NameError at runtime when processing Content objects with parts.

🐛 Proposed fix
```diff
 async def _process_content_item(content_item, span):
     """Process a single content item, handling different types (Content objects, strings, Parts)"""
     processed_content = []
     if hasattr(content_item, "parts"):
         # Content with parts (Google GenAI Content object)
         for part_index, part in enumerate(content_item.parts):
-            parts_acc.extend(await _parts_from_genai_part_async(part, span, part_index))
+            processed_part = await _process_content_part(part, span, part_index)
+            if processed_part:
+                processed_content.append(processed_part)
     elif isinstance(content_item, str):
```
🧰 Tools
🪛 Ruff (0.15.9)

[error] 244-244: Undefined name parts_acc

(F821)


[error] 244-244: Undefined name _parts_from_genai_part_async

(F821)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 238 - 261, The function _process_content_item uses an undefined
accumulator and helper: replace the erroneous parts_acc usage with the local
processed_content (i.e., call processed_content.extend(...)) and ensure the
helper _parts_from_genai_part_async is provided — either implement/import a
coroutine named _parts_from_genai_part_async(part, span, part_index) that
returns a list of processed parts or rename the call to the correct existing
async helper if it already exists; update _process_content_item to extend
processed_content with the awaited result and return processed_content.

Comment on lines +76 to 78
```python
assert attrs[SpanAttributes.LLM_USAGE_TOTAL_TOKENS] > 0
assert attrs[GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS] > 0
assert attrs[GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS] > 0
```


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Check the SpanAttributes definitions for total tokens attribute
rg -n 'TOTAL_TOKENS|total_tokens' packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py
```

Repository: traceloop/openllmetry

Length of output: 185


Resolve inconsistent token attribute usage between test and implementation.

The test at lines 76-78 asserts SpanAttributes.LLM_USAGE_TOTAL_TOKENS, but span_utils.py line 723 sets SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS. These are not aliases—they are two distinct attributes with different underlying values ("llm.usage.total_tokens" vs "gen_ai.usage.total_tokens"). Ensure the test and implementation use the same attribute, or verify if both should be set.
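A tiny illustration of why the mismatch bites, using the literal attribute values quoted above (the `attrs` dict stands in for the exported span's attributes):

```python
# The two constants are distinct keys, not aliases:
LLM_USAGE_TOTAL_TOKENS = "llm.usage.total_tokens"        # legacy Traceloop key
GEN_AI_USAGE_TOTAL_TOKENS = "gen_ai.usage.total_tokens"  # OTel GenAI semconv key

# What span_utils.py sets on the span:
attrs = {GEN_AI_USAGE_TOTAL_TOKENS: 42}

assert GEN_AI_USAGE_TOTAL_TOKENS in attrs
# The test's lookup key is simply absent, so attrs[LLM_USAGE_TOTAL_TOKENS]
# raises KeyError rather than comparing token counts:
assert LLM_USAGE_TOTAL_TOKENS not in attrs
```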

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`
around lines 76 - 78, The test asserts SpanAttributes.LLM_USAGE_TOTAL_TOKENS
while the implementation sets SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS, causing
a mismatch; fix by making them consistent: either update the test to assert
SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS (and related GEN_AI_* attributes) or
update the implementation (the code that assigns token attributes in
span_utils.py) to also set the LLM_USAGE_* attributes (or set both keys) so both
constants are present; ensure you update or add the corresponding assertions for
input/output/total tokens (SpanAttributes.LLM_USAGE_TOTAL_TOKENS,
GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS,
GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS) consistently.



Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: span attributes gen_ai.prompt and gen_ai.completion are deprecated in the latest OpenTelemetry Semantic Conventions

2 participants