fix(google-genai): migrate span attrs to gen_ai.input/output.messages (fixes #3515) #3948
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough: Refactors span attribute emission to stop writing per-index `gen_ai.prompt.{N}` / `gen_ai.completion.{N}` attributes.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)
49-57: Strengthen schema assertions for migrated message payloads. Current checks only validate `role`; they won't catch regressions in the `parts` structure (type/content) that this migration depends on.

✅ Suggested test hardening

```diff
 assert GenAIAttributes.GEN_AI_INPUT_MESSAGES in attrs
 input_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_INPUT_MESSAGES])
 assert len(input_msgs) > 0
 assert input_msgs[0]["role"] == "user"
+assert isinstance(input_msgs[0].get("parts"), list) and len(input_msgs[0]["parts"]) > 0
+assert input_msgs[0]["parts"][0]["type"] == "text"
+assert "content" in input_msgs[0]["parts"][0]
 assert GenAIAttributes.GEN_AI_OUTPUT_MESSAGES in attrs
 output_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_OUTPUT_MESSAGES])
 assert len(output_msgs) > 0
 assert output_msgs[0]["role"] == "assistant"
+assert isinstance(output_msgs[0].get("parts"), list) and len(output_msgs[0]["parts"]) > 0
+assert output_msgs[0]["parts"][0]["type"] == "text"
+assert "content" in output_msgs[0]["parts"][0]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py` around lines 49 - 57, The tests currently only assert the "role" field for messages stored under GenAIAttributes.GEN_AI_INPUT_MESSAGES and GenAIAttributes.GEN_AI_OUTPUT_MESSAGES; extend these assertions to validate the migrated message payload schema by checking each message's "parts" structure (e.g., ensure input_msgs[0] and output_msgs[0] have a "parts" list, that parts[0] contains "type" and "content" keys, and that "content" is the expected string or non-empty value), so the test fails if the required type/content fields are missing or empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 409-433: The code currently hardcodes "finish_reason": "stop" for
every appended message; instead, only include a finish_reason field when the
non-streaming Google response provides one (use
response.candidates[0].finish_reason when present) and omit finish_reason for
streaming paths (where response was a list/str or built from response.text).
Update the branches that handle the original non-streaming response object (the
branches referencing response and response.text and appending into
output_messages) to read finish = getattr(response, "candidates", None) and, if
finish and finish[0].finish_reason is truthy, include "finish_reason":
finish[0].finish_reason in the dict; otherwise do not add the finish_reason key.
Leave the streaming branches that handle list/str (the paths enumerating
response or using response as str) unchanged so they do not include
finish_reason.
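The conditional attachment the prompt describes can be sketched in a few lines. This is a hypothetical illustration, not the PR's actual code: `append_output_message` is an invented helper name, and the response shape only mimics the `candidates[0].finish_reason` attributes the prompt refers to.

```python
def append_output_message(response, output_messages, text):
    """Append an assistant message, attaching finish_reason only when present."""
    msg = {"role": "assistant", "parts": [{"type": "text", "content": text}]}
    # Non-streaming responses expose candidates; streaming list/str paths do not.
    candidates = getattr(response, "candidates", None)
    if candidates and getattr(candidates[0], "finish_reason", None):
        msg["finish_reason"] = candidates[0].finish_reason
    output_messages.append(msg)
```

Streaming branches that build messages from a plain list or string never pass an object with `candidates`, so no `finish_reason` key is emitted on those paths.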
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1215e26d-911c-4d0f-b60a-791811eb5b38
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
- …n unknown: Updated output message construction to include finish_reason when available and adjusted handling for both non-streaming and streaming responses.
- … in messages: Add assertions to validate the structure of input and output messages in tests.
Addressed both CodeRabbit review items in the latest two commits:
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)
49-63: Assert that the deprecated indexed attributes are gone. These checks only prove the new JSON blobs exist. They would still pass if `gen_ai.prompt.*` / `gen_ai.completion.*` were emitted alongside them, which leaves the core migration behavior unguarded.

🧪 Minimal regression guard

```diff
 assert GenAIAttributes.GEN_AI_INPUT_MESSAGES in attrs
+assert not any(key.startswith("gen_ai.prompt.") for key in attrs.keys())
 input_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_INPUT_MESSAGES])
 assert GenAIAttributes.GEN_AI_OUTPUT_MESSAGES in attrs
+assert not any(key.startswith("gen_ai.completion.") for key in attrs.keys())
 output_msgs = json.loads(attrs[GenAIAttributes.GEN_AI_OUTPUT_MESSAGES])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py` around lines 49 - 63, Add assertions to ensure deprecated indexed attributes are not present: after loading attrs and verifying GenAIAttributes.GEN_AI_INPUT_MESSAGES and GenAIAttributes.GEN_AI_OUTPUT_MESSAGES, assert that no keys in attrs start with "gen_ai.prompt." or "gen_ai.completion." (e.g., use any(k.startswith("gen_ai.prompt.") or k.startswith("gen_ai.completion.") for k in attrs) and assert that this is False). Reference the attrs variable and the existing GenAIAttributes.GEN_AI_INPUT_MESSAGES / GenAIAttributes.GEN_AI_OUTPUT_MESSAGES checks to locate where to add these assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 202-218: The current handling of kwargs["contents"] in this block
(and the similar block at 253-303) only handles str vs list and treats every
list item as a separate turn, which misclassifies single Content objects and
iterables of Part objects; change the normalization so that you first detect
whether contents is a Content-like object (has attributes role and parts) and
treat it as one turn, detect whether contents is a single Part (has attributes
type/content) and wrap it into a single turn with default role "user", and
detect whether contents is an iterable: if the iterable's items are Content
objects treat each as a separate turn, but if the iterable's items are Part
objects aggregate them into one turn. Use the existing _process_content_item to
process parts, preserve role via getattr(content_item, "role", "user"), and
append consistent dicts to input_messages with keys "role" and "parts"; apply
the same normalization logic in both the block around input_messages
construction and the similar code at lines 253-303.
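A rough sketch of the sniffing logic this comment asks for. `normalize_contents` and `part_to_dict` are hypothetical names, and the duck-typing checks (Content-like objects carry `.role` and `.parts`; Part-like objects do not) are assumptions about the SDK's shapes rather than a verified type surface:

```python
def normalize_contents(contents):
    """Normalize str / Content / list[Content] / list[Part|str] into message dicts."""
    def part_to_dict(part):
        if isinstance(part, str):
            return {"type": "text", "content": part}
        text = getattr(part, "text", None)
        return {"type": "text", "content": text if text is not None else str(part)}

    if isinstance(contents, str):
        return [{"role": "user", "parts": [part_to_dict(contents)]}]
    if hasattr(contents, "parts"):
        # Single Content object passed directly: one turn
        return [{
            "role": getattr(contents, "role", "user") or "user",
            "parts": [part_to_dict(p) for p in contents.parts],
        }]
    if isinstance(contents, (list, tuple)) and contents:
        if hasattr(contents[0], "parts"):
            # list[Content]: multi-turn, each Content is its own message
            return [{
                "role": getattr(c, "role", "user") or "user",
                "parts": [part_to_dict(p) for p in c.parts],
            } for c in contents]
        # list[Part] or list[str]: single turn, aggregate into one user message
        return [{"role": "user", "parts": [part_to_dict(p) for p in contents]}]
    return []
```

Calling this from both the async and sync `set_input_attributes` variants is what keeps the two code paths symmetric.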
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 970a054d-55c6-4934-90f0-ee656e5f647a
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
…bsent
Adds negative assertions to test_client_spans to verify deprecated gen_ai.prompt.{N} and gen_ai.completion.{N} attributes are absent after migrating to gen_ai.input.messages / gen_ai.output.messages. Addresses CodeRabbit review feedback (round 2).
…tent] from list[Part] in contents normalization: Addresses CodeRabbit major review comment on contents normalization:
1. Adds an `elif hasattr(contents, "parts")` branch to handle a single Content object passed directly (not wrapped in a list).
2. In the `elif isinstance(contents, list)` branch, sniffs `contents[0]` to distinguish multi-turn conversations (`list[Content]`, each with `.parts`) from single-turn inputs (`list[Part]` or `list[str]`). Multi-turn: each Content becomes a separate message. Single-turn: all Parts are collected into one user message.
Applied symmetrically to both the async `set_input_attributes` and the sync `set_input_attributes_sync`.
Addressed all CodeRabbit round-2 feedback (commits 08f2f74 and 56f5e51):

test_generate_content.py (commit 08f2f74): Added negative assertions after the existing checks:

```python
assert not any(key.startswith("gen_ai.prompt.") for key in attrs.keys())
assert not any(key.startswith("gen_ai.completion.") for key in attrs.keys())
```

span_utils.py (commit 56f5e51): Fixed the contents normalization. The fix applies to both the async and sync code paths.
Actionable comments posted: 2
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
123-138: ⚠️ Potential issue | 🟠 Major: Normalize `Part` inputs before serializing `gen_ai.input.messages`.

The SDK accepts `contents` as a single `Part`, groups non-function-call parts into a user turn, groups function-call parts into a model turn, and allows mixed `ContentUnion` lists to alternate between those grouped turns. This code still only special-cases `str`/`Content`/`list`, forces non-`Content` lists to `"user"`, and falls back to `str(part)` for direct Part-like values, so valid tool-calling or multimodal inputs can be dropped or mis-serialized. (googleapis.github.io)

Also applies to: 202-239, 274-340
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 123 - 138, This block mishandles Part-like inputs by only special-casing str/list and falling back to str(content_item), which can drop or mis-serialize tool-calling or multimodal Parts; update the processing around content_item to first normalize any Part/Part-like or ContentUnion instances into canonical message parts (preserving function-call vs non-function-call semantics and list alternation rules used by gen_ai.input.messages) before serializing: detect Part objects (instead of treating them as generic objects), route image Parts through _is_image_part/_process_image_part as before, group non-function-call Parts into user turns and function-call Parts into model turns, and handle single Part or single-list inputs identically to lists so you never fall back to processed_content.append({"type":"text","content": str(content_item)}); apply the same normalization logic to the other processing sites referenced in the review (the blocks around lines handling content lists) to ensure consistent serialization.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 445-459: The code currently reads a single finish_reason from
_candidates[0] and reuses it for every message in the multi-message branch;
change logic so each message uses its own candidate's finish_reason: iterate
over _candidates (or ensure the loop enumerates response/candidates) and for
each item use that item.finish_reason (handling hasattr(item.finish_reason,
"name") as done now) to set msg["finish_reason"] before appending to
output_messages; update references to response.text/list branching and the
variables _candidates, fr, and _finish_reason accordingly so each candidate's
finish_reason is applied to its own msg.
- Around line 443-487: The handler only serializes response.text; update the
logic in span_utils.py (the block building output_messages used before calling
_set_span_attribute with GenAIAttributes.GEN_AI_OUTPUT_MESSAGES) to iterate
response.parts (and response.function_calls when present) in addition to
response.text so all part types (function_call, inline_data, file_data,
executable_code, code_execution_result, etc.) are captured into the parts list
structure, preserving existing fields like "type", "content" (or structured
payload for function calls), and adding finish_reason when available from
response.candidates; ensure GenerateContentResponse.parts entries are normalized
into the same message format currently used for text so downstream consumers
receive complete gen_ai.output.messages.
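The per-part dispatch described above could be sketched as follows. `serialize_part` is a hypothetical helper, and the attribute names (`function_call`, `inline_data`, `text`) are assumed from the response shapes this review discusses, not a verified API:

```python
def serialize_part(part):
    """Map one response part to the gen_ai.output.messages part schema."""
    fc = getattr(part, "function_call", None)
    if fc:
        return {
            "type": "function_call",
            "name": getattr(fc, "name", None),
            "arguments": dict(fc.args) if getattr(fc, "args", None) else {},
        }
    blob = getattr(part, "inline_data", None)
    if blob:
        return {"type": "blob", "mime_type": getattr(blob, "mime_type", None)}
    text = getattr(part, "text", None)
    if text:
        return {"type": "text", "content": text}
    # Unknown part kinds: keep a stringified placeholder rather than dropping them
    return {"type": "unknown", "content": str(part)}
```

Iterating `candidate.content.parts` through a dispatcher like this (instead of reading only `response.text`) is what lets function-call and binary parts survive into the serialized messages.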
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 31c464ec-5fbf-4173-853e-e80ee36d116d
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
✅ Files skipped from review due to trivial changes (1)
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
Addressed CodeRabbit round-3 feedback in commit 6dc73dc:

Item 1 (Major: handle all response.parts types): Added a …

Item 2 (Minor: per-candidate finish_reason): Removed the single shared …
Actionable comments posted: 1
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
123-139: ⚠️ Potential issue | 🟠 Major: Structured inputs are still dropped or flattened outside the main `contents` path.

`kwargs["contents"]` still only matches `str`, objects with `.parts`, or `list`, so a single part-like object falls through without emitting anything. The list and positional-`args` branches also serialize `.text`/`.parts` objects via `str(...)` instead of the existing content/part serializers, which loses real text, roles, and turn boundaries in `gen_ai.input.messages`. Please funnel both kwargs and positional args through one shared normalizer.

Also applies to: 158-189, 202-248, 282-378
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 123 - 139, The code currently flattens or drops structured inputs by only handling str, .parts lists, or list types and by using str(...) instead of the proper serializers; update span_utils.py to funnel both kwargs["contents"] and all positional args through a single shared normalizer function (create a helper like _normalize_contents used by the blocks around processed_content) that: detects single part-like objects (use _is_image_part) and other content types, calls the existing serializers for content/part instead of str(), and routes image parts to _process_image_part with span.context.trace_id/span_id; replace the duplicated handling in the branches referenced (including the shown block and the ranges 158-189, 202-248, 282-378) to call this shared normalizer so gen_ai.input.messages preserves text, roles, and turn boundaries.
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (1)
441-447: Align tool-call parts with the shared message schema. This serializer emits `{"type": "function_call"}`, but `packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:57-94` already uses `{"type": "tool_call"}` for the same cross-provider part shape. Using a different type here makes downstream parsers provider-specific again.

♻️ Suggested change

```diff
 return {
-    "type": "function_call",
+    "type": "tool_call",
     "name": fc.name,
     "arguments": dict(fc.args) if hasattr(fc, "args") else {},
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 441 - 447, The serializer currently returns a dict with "type": "function_call" for parts where part.function_call is present; change the emitted type to "tool_call" and keep the same shape (keys "name" and "arguments") so the output matches the cross-provider schema used by the other implementation (see the handling of part.function_call -> fc.name and fc.args in span_utils). Update the return inside the if hasattr(part, "function_call") block to emit "type": "tool_call" and preserve converting fc.args to a dict when present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 473-500: The candidate-level fallback that assigns response.text
when parts is empty should be moved out of the per-candidate loop so one
candidate cannot inherit another candidate’s text; instead, only if no candidate
produced any parts at all use the response-wide text to create a single
assistant message. Also fix the second fallback branch to check
isinstance(response.text, list) and iterate over response.text (not response)
when building messages. Update logic around parts, candidates iteration and the
output_messages append (references: parts, candidates loop, response.text,
output_messages, _finish_reason) accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8f62549c-6d10-4524-98d5-1d4a7bc278d8
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
```python
            if not parts:
                # Fallback: try response.text for simple single-part responses
                try:
                    text = response.text
                    if text:
                        parts = [{"type": "text", "content": text}]
                except Exception:
                    pass
            if parts:
                msg = {"role": "assistant", "parts": parts}
                if _finish_reason:
                    msg["finish_reason"] = _finish_reason
                output_messages.append(msg)
        else:
            # No candidates field: fall back to response.text
            try:
                if isinstance(response.text, list):
                    for item in response:
                        output_messages.append({
                            "role": "assistant",
                            "parts": [{"type": "text", "content": item.text}],
                        })
                elif isinstance(response.text, str):
                    output_messages.append({
                        "role": "assistant",
                        "parts": [{"type": "text", "content": response.text}],
                    })
            except Exception:
```
Move the response.text fallback out of the candidate loop.
The current fallback is response-wide, so any candidate with empty content.parts can inherit another candidate’s text. The second fallback also checks isinstance(response.text, list) but iterates response, which will quietly drop output when the response object itself is not iterable.
🛠️ Suggested fix
```diff
-    if not parts:
+    if not parts and len(_candidates) == 1:
         # Fallback: try response.text for simple single-part responses
         try:
             text = response.text
             if text:
                 parts = [{"type": "text", "content": text}]
         except Exception:
             pass
```

```diff
         if isinstance(response.text, list):
-            for item in response:
+            for item in response.text:
                 output_messages.append({
                     "role": "assistant",
-                    "parts": [{"type": "text", "content": item.text}],
+                    "parts": [{"type": "text", "content": getattr(item, "text", item)}],
                 })
```
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 473 - 500, The candidate-level fallback that assigns response.text
when parts is empty should be moved out of the per-candidate loop so one
candidate cannot inherit another candidate’s text; instead, only if no candidate
produced any parts at all use the response-wide text to create a single
assistant message. Also fix the second fallback branch to check
isinstance(response.text, list) and iterate over response.text (not response)
when building messages. Update logic around parts, candidates iteration and the
output_messages append (references: parts, candidates loop, response.text,
output_messages, _finish_reason) accordingly.
Actionable comments posted: 2
⚠️ Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py (1)
167-168: ⚠️ Potential issue | 🔴 Critical: `MagicMock` used but not imported.

The test uses `MagicMock()` on line 167, but there's no import for it. This will cause `NameError: name 'MagicMock' is not defined` when running this test.

🐛 Proposed fix: add import

```diff
 import json
 import pytest
+from unittest.mock import MagicMock
 from opentelemetry.trace import StatusCode, SpanKind
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py` around lines 167-168: The test uses MagicMock (see `span = MagicMock()` in tests/test_generate_content.py) but MagicMock is not imported; add an import for MagicMock (e.g., `from unittest.mock import MagicMock`) at the top of the test module so the `span = MagicMock()` and `span.is_recording.return_value = True` lines can run without NameError.

packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)
720-735: ⚠️ Potential issue | 🔴 Critical: Variable `um` used but never assigned, causing `NameError`.

The code checks `hasattr(response, "usage_metadata")` but never assigns `um`. Lines 724, 729, and 734 reference `um.total_token_count`, `um.candidates_token_count`, and `um.prompt_token_count`, which will fail at runtime.

🐛 Proposed fix

```diff
 _set_span_attribute(span, GenAIAttributes.GEN_AI_RESPONSE_MODEL, llm_model)
 if hasattr(response, "usage_metadata"):
+    um = response.usage_metadata
     _set_span_attribute(
         span,
         SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS,
         um.total_token_count,
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 720 - 735, The code references an unassigned variable um when reading usage fields; fix by reading the response.usage_metadata into a local variable (e.g., um = response.usage_metadata) immediately after the hasattr(response, "usage_metadata") check, guard for None, then call _set_span_attribute using um.total_token_count, um.candidates_token_count, and um.prompt_token_count; update the block around response, um, and _set_span_attribute (and leave SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS / GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS / GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS unchanged) so it no longer raises NameError.
740-741: ⚠️ Potential issue | 🔴 Critical: Add missing imports for the `_GCP_GEN_AI` and `_GEN_CONTENT` constants.

These constants are used at lines 740-741 in `span_utils.py` but are not imported. They are defined in `__init__.py` and should be imported from there to avoid `NameError` at runtime.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 740 - 741, Import the missing constants _GCP_GEN_AI and _GEN_CONTENT into span_utils.py so they are available where used in the attributes dict (GenAIAttributes.GEN_AI_PROVIDER_NAME: _GCP_GEN_AI, GenAIAttributes.GEN_AI_OPERATION_NAME: _GEN_CONTENT); add them to the module-level imports by importing _GCP_GEN_AI and _GEN_CONTENT from the package root (where they are defined in __init__.py) alongside the other constants to prevent NameError at runtime.
♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py (2)
663-670: ⚠️ Potential issue | 🟡 Minor: Response-wide `response.text` fallback inside per-candidate loop may cause incorrect attribution.

The fallback to `response.text` (lines 665-668) is inside the per-candidate loop. For multi-candidate responses, if any candidate has empty `content.parts`, it will use the response-wide `response.text`, which typically reflects only the first candidate's text. Consider moving this fallback outside the loop or limiting it to single-candidate responses.

🛡️ Suggested guard

```diff
 if not parts:
-    # Fallback: try response.text for simple single-part responses
-    try:
-        text = response.text
-        if text:
-            parts = [{"type": "text", "content": text}]
-    except Exception:
-        pass
+    # Fallback only for single-candidate responses
+    if len(_candidates) == 1:
+        try:
+            text = response.text
+            if text:
+                parts = [{"type": "text", "content": text}]
+        except Exception:
+            pass
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 663 - 670, The per-candidate fallback that sets parts from response.text should not run inside the per-candidate loop because response.text typically reflects the first candidate and will misattribute text to other candidates; update the logic in span_utils.py so that you only use response.text when the response has a single candidate (e.g., check len(response.candidates) == 1) or move the response.text fallback out of the candidate-processing loop and apply it only when no candidates produced any parts; adjust the branch that currently checks "if not parts:" inside the candidate loop to instead use a guarded check (single-candidate) or perform a post-loop fallback using response.text.
676-691: ⚠️ Potential issue | 🟠 Major

Fallback branch iterates `response` instead of `response.text`.

When `response.text` is a list (line 679), the code iterates over `response` (line 680) rather than `response.text`. Additionally, accessing `item.text` assumes items are objects with a `text` attribute, which may not be true if `response.text` contains strings.

🐛 Proposed fix
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py` around lines 676 - 691, In the else fallback in span_utils.py (the block handling response.text when no candidates field exists), fix the loop to iterate over response.text (not response) and handle list item types safely: for each item in response.text, if the item is a string use it directly as the "content", otherwise if it has a .text attribute use that, and skip or stringify other types; update the branch inside the function that builds output_messages so it checks isinstance(response.text, list) and accesses items correctly instead of using item.text blindly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 238-261: The function _process_content_item uses an undefined
accumulator and helper: replace the erroneous parts_acc usage with the local
processed_content (i.e., call processed_content.extend(...)) and ensure the
helper _parts_from_genai_part_async is provided — either implement/import a
coroutine named _parts_from_genai_part_async(part, span, part_index) that
returns a list of processed parts or rename the call to the correct existing
async helper if it already exists; update _process_content_item to extend
processed_content with the awaited result and return processed_content.
In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`:
- Around line 76-78: The test asserts SpanAttributes.LLM_USAGE_TOTAL_TOKENS
while the implementation sets SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS, causing
a mismatch; fix by making them consistent: either update the test to assert
SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS (and related GEN_AI_* attributes) or
update the implementation (the code that assigns token attributes in
span_utils.py) to also set the LLM_USAGE_* attributes (or set both keys) so both
constants are present; ensure you update or add the corresponding assertions for
input/output/total tokens (SpanAttributes.LLM_USAGE_TOTAL_TOKENS,
GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS,
GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS) consistently.
---
Outside diff comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 720-735: The code references an unassigned variable um when
reading usage fields; fix by reading the response.usage_metadata into a local
variable (e.g., um = response.usage_metadata) immediately after the
hasattr(response, "usage_metadata") check, guard for None, then call
_set_span_attribute using um.total_token_count, um.candidates_token_count, and
um.prompt_token_count; update the block around response, um, and
_set_span_attribute (and leave SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS /
GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS /
GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS unchanged) so it no longer raises
NameError.
- Around line 740-741: Import the missing constants _GCP_GEN_AI and _GEN_CONTENT
into span_utils.py so they are available where used in the attributes dict
(GenAIAttributes.GEN_AI_PROVIDER_NAME: _GCP_GEN_AI,
GenAIAttributes.GEN_AI_OPERATION_NAME: _GEN_CONTENT); add them to the
module-level imports by importing _GCP_GEN_AI and _GEN_CONTENT from the package
root (where they are defined in __init__.py) alongside the other constants to
prevent NameError at runtime.
In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`:
- Around line 167-168: The test uses MagicMock (see span = MagicMock() in
tests/test_generate_content.py) but MagicMock is not imported; add an import for
MagicMock (e.g., from unittest.mock import MagicMock) at the top of the test
module so the span = MagicMock() and span.is_recording.return_value = True lines
can run without NameError.
---
Duplicate comments:
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`:
- Around line 663-670: The per-candidate fallback that sets parts from
response.text should not run inside the per-candidate loop because response.text
typically reflects the first candidate and will misattribute text to other
candidates; update the logic in span_utils.py so that you only use response.text
when the response has a single candidate (e.g., check len(response.candidates)
== 1) or move the response.text fallback out of the candidate-processing loop
and apply it only when no candidates produced any parts; adjust the branch that
currently checks "if not parts:" inside the candidate loop to instead use a
guarded check (single-candidate) or perform a post-loop fallback using
response.text.
- Around line 676-691: In the else fallback in span_utils.py (the block handling
response.text when no candidates field exists), fix the loop to iterate over
response.text (not response) and handle list item types safely: for each item in
response.text, if the item is a string use it directly as the "content",
otherwise if it has a .text attribute use that, and skip or stringify other
types; update the branch inside the function that builds output_messages so it
checks isinstance(response.text, list) and accesses items correctly instead of
using item.text blindly.
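The list-handling fix in the last prompt amounts to normalising `response.text` before iterating. A minimal sketch, assuming items may be plain strings or objects exposing a `.text` attribute (the function name is illustrative, not the module's real helper):

```python
from types import SimpleNamespace

def fallback_messages(response_text):
    # Normalise to a list, then coerce each item safely: plain strings pass
    # through, objects contribute their .text if present, anything else is
    # stringified rather than raising AttributeError.
    items = response_text if isinstance(response_text, list) else [response_text]
    messages = []
    for item in items:
        content = item if isinstance(item, str) else getattr(item, "text", str(item))
        messages.append({"role": "assistant", "parts": [{"type": "text", "content": content}]})
    return messages

assert fallback_messages("hi")[0]["parts"][0]["content"] == "hi"
assert [m["parts"][0]["content"] for m in fallback_messages(["a", SimpleNamespace(text="b")])] == ["a", "b"]
```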
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: eadafac1-a58b-43fc-9f8c-5c6cfeb6ada0
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py
- packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py
```diff
 async def _process_content_item(content_item, span):
-    parts_acc = []
-    if isinstance(content_item, dict):
-        for p in content_item.get("parts", []):
-            if isinstance(p, dict):
-                parts_acc.append(_map_dict_part(p))
-            else:
-                parts_acc.extend(await _parts_from_genai_part_async(p, span, 0))
-    elif hasattr(content_item, "parts"):
+    """Process a single content item, handling different types (Content objects, strings, Parts)"""
+    processed_content = []
+    if hasattr(content_item, "parts"):
+        # Content with parts (Google GenAI Content object)
         for part_index, part in enumerate(content_item.parts):
             parts_acc.extend(await _parts_from_genai_part_async(part, span, part_index))
     elif isinstance(content_item, str):
-        parts_acc.append({"type": "text", "content": content_item})
+        # Direct string in the list
+        processed_content.append({"type": "text", "content": content_item})
     elif _is_image_part(content_item):
-        img = await _otel_image_part_from_genai_part_async(content_item, span, 0)
-        if img:
-            parts_acc.append(img)
+        # Direct Part object that's an image
+        processed_image = await _process_image_part(
+            content_item,
+            span.context.trace_id,
+            span.context.span_id,
+            0
+        )
+        if processed_image is not None:
+            processed_content.append(processed_image)
     else:
-        parts_acc.append({"type": "text", "content": str(content_item)})
-    return parts_acc
+        # Other content types
+        processed_content.append({"type": "text", "content": str(content_item)})
+    return processed_content
```
Critical: Undefined variable `parts_acc` and function `_parts_from_genai_part_async` will cause runtime errors.

The static analysis correctly identifies two undefined names:

- Line 244 uses `parts_acc`, but the function initializes `processed_content`
- `_parts_from_genai_part_async` is called but never defined in this module

This will raise `NameError` at runtime when processing Content objects with parts.
🐛 Proposed fix
```diff
 async def _process_content_item(content_item, span):
     """Process a single content item, handling different types (Content objects, strings, Parts)"""
     processed_content = []
     if hasattr(content_item, "parts"):
         # Content with parts (Google GenAI Content object)
         for part_index, part in enumerate(content_item.parts):
-            parts_acc.extend(await _parts_from_genai_part_async(part, span, part_index))
+            processed_part = await _process_content_part(part, span, part_index)
+            if processed_part:
+                processed_content.append(processed_part)
     elif isinstance(content_item, str):
```

🧰 Tools
🪛 Ruff (0.15.9)
[error] 244-244: Undefined name `parts_acc` (F821)

[error] 244-244: Undefined name `_parts_from_genai_part_async` (F821)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-google-generativeai/opentelemetry/instrumentation/google_generativeai/span_utils.py`
around lines 238 - 261, The function _process_content_item uses an undefined
accumulator and helper: replace the erroneous parts_acc usage with the local
processed_content (i.e., call processed_content.extend(...)) and ensure the
helper _parts_from_genai_part_async is provided — either implement/import a
coroutine named _parts_from_genai_part_async(part, span, part_index) that
returns a list of processed parts or rename the call to the correct existing
async helper if it already exists; update _process_content_item to extend
processed_content with the awaited result and return processed_content.
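The prompt above boils down to one rule: every branch feeds the same local accumulator that the function returns. A stripped-down illustration of that pattern (the part handling is deliberately simplified and the name is illustrative, not the module's real helper):

```python
import asyncio

async def process_content_item(content_item):
    # Single local accumulator; every branch appends to it and the function
    # returns it, so an undefined-name bug like parts_acc cannot slip in.
    processed_content = []
    if hasattr(content_item, "parts"):
        for part in content_item.parts:
            processed_content.append({"type": "text", "content": str(part)})
    elif isinstance(content_item, str):
        processed_content.append({"type": "text", "content": content_item})
    else:
        processed_content.append({"type": "text", "content": str(content_item)})
    return processed_content

assert asyncio.run(process_content_item("hello")) == [{"type": "text", "content": "hello"}]
```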
```python
assert attrs[SpanAttributes.LLM_USAGE_TOTAL_TOKENS] > 0
assert attrs[GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS] > 0
assert attrs[GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS] > 0
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Check the SpanAttributes definitions for total tokens attribute
rg -n 'TOTAL_TOKENS|total_tokens' packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py
```

Repository: traceloop/openllmetry
Length of output: 185
Resolve inconsistent token attribute usage between test and implementation.
The test at lines 76-78 asserts `SpanAttributes.LLM_USAGE_TOTAL_TOKENS`, but `span_utils.py` line 723 sets `SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS`. These are not aliases; they are two distinct attributes with different underlying values (`"llm.usage.total_tokens"` vs `"gen_ai.usage.total_tokens"`). Ensure the test and implementation use the same attribute, or verify whether both should be set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-google-generativeai/tests/test_generate_content.py`
around lines 76 - 78, The test asserts SpanAttributes.LLM_USAGE_TOTAL_TOKENS
while the implementation sets SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS, causing
a mismatch; fix by making them consistent: either update the test to assert
SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS (and related GEN_AI_* attributes) or
update the implementation (the code that assigns token attributes in
span_utils.py) to also set the LLM_USAGE_* attributes (or set both keys) so both
constants are present; ensure you update or add the corresponding assertions for
input/output/total tokens (SpanAttributes.LLM_USAGE_TOTAL_TOKENS,
GenAIAttributes.GEN_AI_USAGE_INPUT_TOKENS,
GenAIAttributes.GEN_AI_USAGE_OUTPUT_TOKENS) consistently.
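One way to satisfy both assertion styles, sketched under the assumption that emitting both keys during the migration is acceptable (the attribute strings are the ones quoted in the finding; the helper name is hypothetical):

```python
def token_usage_attributes(prompt_tokens, candidate_tokens):
    # Write both the legacy llm.* key and the stable gen_ai.* keys so
    # tests asserting either convention keep passing during migration.
    total = prompt_tokens + candidate_tokens
    return {
        "llm.usage.total_tokens": total,
        "gen_ai.usage.total_tokens": total,
        "gen_ai.usage.input_tokens": prompt_tokens,
        "gen_ai.usage.output_tokens": candidate_tokens,
    }

attrs = token_usage_attributes(12, 34)
assert attrs["llm.usage.total_tokens"] == attrs["gen_ai.usage.total_tokens"] == 46
```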
Summary
Fixes #3515.
Migrates `google_generativeai/span_utils.py` from deprecated indexed span attributes to the stable OTel GenAI Semantic Conventions:

- `gen_ai.prompt.{N}.content` / `gen_ai.prompt.{N}.role` -> `gen_ai.input.messages` (JSON array)
- `gen_ai.completion.{N}.content` / `gen_ai.completion.{N}.role` -> `gen_ai.output.messages` (JSON array)

The new attributes store a JSON-serialised list of `{"role": "...", "parts": [...]}` objects, consistent with the format used by the OpenAI instrumentation in this repo.

Changes

- `span_utils.py`: Removed `_set_prompt_attributes`; `set_input_attributes` and `set_input_attributes_sync` now build an `input_messages` list and write it to `gen_ai.input.messages`; `set_response_attributes` builds an `output_messages` list and writes it to `gen_ai.output.messages`.
- `tests/test_generate_content.py`: Updated assertions to parse the new JSON blob attributes instead of the old indexed ones.

Checklist
fix(instrumentation): ...

Summary by CodeRabbit

- Refactor
- Tests
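The serialised message shape described in the summary can be illustrated with a minimal sketch (sample content only, not the PR's actual code):

```python
import json

input_messages = [
    {"role": "user", "parts": [{"type": "text", "content": "What is OpenTelemetry?"}]},
]
output_messages = [
    {"role": "assistant", "parts": [{"type": "text", "content": "An observability framework."}]},
]

# One JSON string per direction replaces the old per-index
# gen_ai.prompt.{N}.* / gen_ai.completion.{N}.* attributes.
attributes = {
    "gen_ai.input.messages": json.dumps(input_messages),
    "gen_ai.output.messages": json.dumps(output_messages),
}
assert json.loads(attributes["gen_ai.input.messages"])[0]["role"] == "user"
```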