Skip to content

feat: add metadata field to messages for stateful context tracking#2125

Open
lizradway wants to merge 2 commits intostrands-agents:mainfrom
lizradway:metadata
Open

feat: add metadata field to messages for stateful context tracking#2125
lizradway wants to merge 2 commits intostrands-agents:mainfrom
lizradway:metadata

Conversation

@lizradway
Copy link
Copy Markdown
Member

@lizradway lizradway commented Apr 14, 2026

Description

Adds an optional metadata field to the Message TypedDict that carries per-message usage, metrics, and arbitrary custom data from model responses. This is a foundational piece for the context management roadmap — downstream features like proactive compression (#555), smart truncation, and per-message cost analysis need this information attached directly to messages.

What it does:

  • Adds MessageMetadata TypedDict and metadata: NotRequired[MessageMetadata] on Message
  • Populates metadata on assistant messages immediately after stream processing, before AfterModelCallEvent fires (so all hook consumers see consistent state)
  • Whitelists only role and content before model calls — metadata (and any future non-model fields) never leak to providers
  • Replaces Message.__annotations__.keys() with Message.__required_keys__ in agent.py for message detection, so the optional metadata field doesn't affect input type inference
  • Adds get_message_metadata() convenience accessor

Note: The existing test test_event_loop_cycle_tool_result previously asserted 4 messages in the model.stream call, but the 4th message was the response FROM the model, not input TO it. This was a pre-existing bug in the test that was corrected as part of this change.

Related Issues

Closes #1532

Documentation PR

N/A — no new public API surface; metadata is framework-internal.

Type of Change

New feature

Testing

How have you tested the change?

  • Added test_event_loop_metadata.py with 3 tests: metadata population with real usage, default values when no metadata event, and verification that metadata is stripped before model calls
  • Added test_message_metadata.py with 4 unit tests for the MessageMetadata type and get_message_metadata accessor
  • Updated existing test assertions across 7 files to account for metadata presence on assistant messages
  • All 1569 tests pass
  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…trands-agents#1532)

Attach usage and metrics from model responses directly to assistant
messages, enabling downstream features like smart truncation and
per-message cost analysis.
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@github-actions

This comment was marked as resolved.

@github-actions

This comment was marked as off-topic.

@lizradway lizradway marked this pull request as ready for review April 14, 2026 20:10
@mkmeral
Copy link
Copy Markdown
Contributor

mkmeral commented Apr 14, 2026

/strands review

Also tell me about how this impacts serialization/deserialization of messages, and if it can impact session managers.

@github-actions
Copy link
Copy Markdown

Assessment: Comment

All issues from previous review rounds have been addressed. The PR checklist is now fully checked, and the code is in good shape.

Session Manager Serialization/Deserialization Impact

Per @mkmeral's request, here is the serialization analysis:

Save path: SessionMessage.from_message() stores the full Message dict (including metadata). to_dict() uses dataclasses.asdict()encode_bytes_values(), which recursively processes dicts/lists. Metadata contains only JSON-primitive types (int, str, dict), so it round-trips cleanly through JSON for both FileSessionManager and S3SessionManager.

Load path: SessionMessage.from_dict()decode_bytes_values() reconstructs the message dict. to_message() returns the raw dict. Metadata is preserved end-to-end.

Backward compatibility: Old sessions persisted without metadata load correctly — from_dict() produces a message without the key, and get_message_metadata() returns {}. No schema migration needed.

Conversation managers: SlidingWindowConversationManager trims by slicing the messages list and mutates only message["content"] during tool result truncation — metadata is untouched. SummarizingConversationManager replaces summarized messages with a new summary message (no metadata on it, which is correct since it's a synthetic user message). Retained messages keep their metadata.

One gap: SummarizingConversationManager._generate_summary_with_model calls model.stream() directly, bypassing the streaming.py metadata stripping. Currently safe because all providers construct fresh dicts, but it's a defense-in-depth gap. See inline comment.

Previously Raised Items
  • The whitelist approach in streaming.py (Message(role=..., content=...)) was raised in rounds 1 and 2. It's a robustness suggestion, not a blocker.
  • All critical/important items from prior rounds are resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add metadata field to messages for stateful context tracking

2 participants