fix: preserve output type in FunctionSpanData.export()#2871
fix: preserve output type in FunctionSpanData.export()#2871lawrence3699 wants to merge 1 commit intoopenai:mainfrom
Conversation
FunctionSpanData.export() applied str(self.output) if self.output else None, which corrupted outputs in two ways: 1. Dict/list outputs were converted to Python repr strings (single quotes, Python booleans) instead of remaining as dicts that json.dumps can serialize correctly. 2. Falsy but valid outputs (0, False, empty string, empty list) were silently replaced with None. This is inconsistent with GenerationSpanData.export(), which passes self.output through directly. The fix removes the str() conversion to match the sibling pattern. Updated MCP tracing snapshots accordingly and added unit tests covering dict, string, None, falsy, list, and numeric output values.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1041c2acd
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| "name": self.name, | ||
| "input": self.input, | ||
| "output": str(self.output) if self.output else None, | ||
| "output": self.output, |
There was a problem hiding this comment.
Preserve JSON-safe fallback for function span outputs
Passing self.output through unmodified can break trace export when a tool returns a non-JSON-serializable object (for example a custom class or datetime) and the BackendSpanExporter is configured with a non-default endpoint. In that configuration _should_sanitize_for_openai_tracing_api() is false, so httpx.Client.post(..., json=payload) receives raw span data and raises during JSON encoding, whereas the previous str(...) conversion kept function span payloads serializable. This makes tracing brittle for custom exporters even though the same tool output still works for run execution.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Fixes FunctionSpanData.export() so tool outputs are exported without being coerced to str(...) or dropped when falsy, aligning behavior with GenerationSpanData.export() and correcting MCP tracing snapshots.
Changes:
- Update
FunctionSpanData.export()to pass throughoutputdirectly (preserving dict/list and falsy values). - Add unit tests covering various output shapes and falsy values for
FunctionSpanData.export(). - Update MCP tracing test expectations to assert structured dict outputs instead of Python repr strings.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/agents/tracing/span_data.py |
Changes function span export to preserve output values/types. |
tests/tracing/test_span_data.py |
Adds unit tests validating output preservation across types, including falsy values. |
tests/mcp/test_mcp_tracing.py |
Updates snapshot assertions to expect structured dict tool outputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "type": self.type, | ||
| "name": self.name, | ||
| "input": self.input, | ||
| "output": str(self.output) if self.output else None, | ||
| "output": self.output, | ||
| "mcp_data": self.mcp_data, | ||
| } |
There was a problem hiding this comment.
FunctionSpanData.export() now returns self.output verbatim, but Trace.export()/BackendSpanExporter expect the exported payload to be JSON-serializable. Tool results can be arbitrary Python objects (including Pydantic models like ToolOutputText), which will make httpx.post(..., json=payload) raise TypeError for non-OpenAI endpoints (and even for the OpenAI endpoint the sanitizer will degrade such values to a generic "<... truncated>" preview). Consider normalizing output to a JSON-compatible value (e.g., pass through dict/list/str/number/bool/None; convert Pydantic models via model_dump(mode="json"), dataclasses via asdict, tuples/sets via list; otherwise fall back to str(output)), while still preserving falsy values like 0/False/""/[].
Summary
FunctionSpanData.export()appliesstr(self.output) if self.output else None, which corrupts traced tool outputs in two ways:Dict/list outputs become Python repr strings. A tool returning
{"status": "ok"}gets exported as the string"{'status': 'ok'}"— single quotes, Python-style booleans — instead of staying as a dict thatjson.dumpscan serialize correctly. This means the tracing ingest API receives malformed data.Falsy but valid outputs are dropped. A tool returning
0,False,"", or[]gets exported asNone, losing the actual result.The sibling class
GenerationSpanData.export()passesself.outputthrough directly without conversion. This fix alignsFunctionSpanDatawith that pattern.Test plan
tests/tracing/test_span_data.pywith 10 unit tests covering dict, string, None, falsy (0, False, empty string, empty list), list, and numeric outputs.make format,make lint,make typecheckall clean.Issue number
N/A — found via code inspection; no existing issue.
Checks
make lintandmake format