Merged
95 commits
7dd9f44
feat: #636 Add human-in-the-loop (HITL) support to the SDK
seratch Dec 23, 2025
5c27fe5
split _run_impl.py into run_internal/
seratch Dec 25, 2025
8458be4
Simplify the run state data
seratch Jan 7, 2026
2695d88
Fix reported issues
seratch Jan 9, 2026
4a6e58b
Fix HITL resume for computer actions and avoid duplicate rejections
seratch Jan 10, 2026
a2012f5
Fix automatic Responses compaction trigger during session persistence
seratch Jan 10, 2026
54e8bb3
Preserve guardrail history when resuming from RunState
seratch Jan 10, 2026
8f56de0
fix python 3.9 errors
seratch Jan 10, 2026
cdc4ead
Preserve conversation tracking when resuming runs
seratch Jan 10, 2026
1d62b7d
Include CompactionItem in RunItem union for type safety
seratch Jan 10, 2026
2c752ad
fix: skip re-executing function tools on HITL resume when outputs exist
seratch Jan 11, 2026
e434a1d
fix: deserialize compaction_item in RunState
seratch Jan 11, 2026
79aface
fix: scope tool rejection to call ids
seratch Jan 11, 2026
a85b746
fix: keep approved tool outputs on HITL resume with pending approvals
seratch Jan 11, 2026
72a296b
feat: add RunState context serialization hooks and metadata
seratch Jan 11, 2026
fba4d45
fix: rebuild HITL function runs from object approvals
seratch Jan 11, 2026
fb9dc20
fix: normalize tool-call dedupe keys for unhashable arguments
seratch Jan 11, 2026
c47471a
fix: harden HITL resume dedupe and nested tool approvals
seratch Jan 11, 2026
c922cf6
fix: make RunState serialization tolerant of non-JSON outputs
seratch Jan 11, 2026
8119aed
feat: add HITL session scenario example and tests
seratch Jan 11, 2026
0ae60f5
fix: surface needs_approval errors on HITL resume
seratch Jan 11, 2026
05c73d3
fix: dedupe tool calls by call_id or id and centralize MCP approval p…
seratch Jan 11, 2026
83b22c1
Centralize tool call deduplication logic
seratch Jan 11, 2026
1e43729
fix: honor filtered inputs in conversation tracking and preserve dupl…
seratch Jan 12, 2026
9f530b1
refactor; add comments
seratch Jan 12, 2026
66565bc
fix: ignore fake response ids in dedupe and strip provider_data for O…
seratch Jan 12, 2026
7d5ddb1
Fix session persistence counter handling and add regression test
seratch Jan 12, 2026
c156161
fix: ignore fake response ids in conversation tracking
seratch Jan 12, 2026
aba9c3f
fix: align OpenAI conversation persistence counts with sanitized items
seratch Jan 12, 2026
a52d29a
Refine run loop cleanup and streaming retry removal
seratch Jan 13, 2026
a6244e3
fix: align streaming input tracking, session persistence count, and t…
seratch Jan 13, 2026
67e8157
fix: honor max_output_length in run_internal shell tools
seratch Jan 16, 2026
74b18c5
merge main branch changes
seratch Jan 19, 2026
a22f439
fix mcp server issue
seratch Jan 19, 2026
1b27b30
fix: avoid duplicate server-managed history and preserve missing exit…
seratch Jan 19, 2026
6cd541d
Fix resume execution for non-approval tool calls
seratch Jan 19, 2026
fa3ffc9
Reset resume state after HITL run-again turn
seratch Jan 19, 2026
d9c4f14
fix: honor context overrides when resuming RunState
seratch Jan 19, 2026
9514213
refactor: move agent tool run result state into agent_tool_state
seratch Jan 19, 2026
db560aa
Isolate nested agent approvals from parent context
seratch Jan 19, 2026
b201ce2
fix: drop callId fallback and skip None tool items
seratch Jan 19, 2026
a856264
Preserve Agent.as_tool positional argument order
seratch Jan 19, 2026
db17316
Fix resumption model input and streamed turns
seratch Jan 19, 2026
19b0140
fix: skip ToolApprovalItem in handoff history and preserve exit_code=0
seratch Jan 19, 2026
b9cbf58
fix: remove exitCode handling
seratch Jan 19, 2026
dc5121c
Fix HITL resume handling for nested approvals
seratch Jan 19, 2026
037fbb1
Fix Usage helper definition order
seratch Jan 19, 2026
3e34102
Update run state agent on handoff
seratch Jan 19, 2026
cdfd321
test: cover rejected nested approvals in agent tool
seratch Jan 19, 2026
4fe64f3
Harden ToolApprovalItem name/arguments access
seratch Jan 19, 2026
dd80d25
Fix tool use tracking by agent instance
seratch Jan 19, 2026
1d5be2a
Persist resumed streaming tool outputs
seratch Jan 19, 2026
7033512
Fix nested agent rejection resume
seratch Jan 19, 2026
c9eb137
Fix HITL session persistence on resume
seratch Jan 19, 2026
cc35786
fix: preserve agent-tool approvals on RunState restore
seratch Jan 19, 2026
39998ac
Refactor HITL run flow helpers
seratch Jan 19, 2026
83b34ec
fix
seratch Jan 20, 2026
1fc2f2f
fix: preserve streaming handoff input on resume
seratch Jan 20, 2026
cae1bcd
fix: align session input handling and handoff tests
seratch Jan 20, 2026
0e98c0f
Preserve pending approvals on resume
seratch Jan 20, 2026
f79a6d0
Fix HITL resume turn tracking
seratch Jan 20, 2026
c551cd6
Fix MCP approval normalization and preserve resumed session history
seratch Jan 20, 2026
29f49f5
Fix resume response duplication and session persistence
seratch Jan 20, 2026
a477cbe
Serialize non-dict original input items in RunState
seratch Jan 20, 2026
9e24b29
Resume nested agent tool after HITL rejection
seratch Jan 20, 2026
b015fc7
Serialize hosted MCP tools for approval persistence
seratch Jan 20, 2026
aa4e798
fix: make HITL examples auto-mode friendly
seratch Jan 20, 2026
2099c0f
Fix tool call dedupe identity
seratch Jan 20, 2026
9f47c51
refactor
seratch Jan 20, 2026
958d793
Fix approval decision precedence for tool calls
seratch Jan 20, 2026
9d5eae7
refactor
seratch Jan 20, 2026
92279aa
feat: persist trace metadata in RunState with opt-in tracing key
seratch Jan 20, 2026
4ff19a7
Handle compaction outputs in turn resolution
seratch Jan 21, 2026
e5284b4
Fix resume tool call dedupe
seratch Jan 21, 2026
9b35548
Fix agent tool run result leak on non-interrupted runs
seratch Jan 21, 2026
14eb8fc
Fix nested agent tool run lookup for call_id collisions
seratch Jan 21, 2026
af28f4b
Preserve None run context in RunState serialization
seratch Jan 21, 2026
8d5e08e
Avoid collapsing tool approvals by call id
seratch Jan 21, 2026
e38abea
Honor needs_approval before stored realtime approvals
seratch Jan 21, 2026
de39da7
fix bugs
seratch Jan 21, 2026
e78dfd3
update agents.md
seratch Jan 21, 2026
957b374
fix bugs; add tests
seratch Jan 21, 2026
1555f5f
fix: prevent leaked agent-tool run results
seratch Jan 21, 2026
9a965cf
add more tests
seratch Jan 21, 2026
e629f29
add tests
seratch Jan 21, 2026
5b2e00f
Fix streaming guardrail persistence for session input and add test
seratch Jan 21, 2026
c2525fa
Align streaming input filtering order
seratch Jan 21, 2026
a655798
Fix hosted MCP approval id handling
seratch Jan 21, 2026
ac4490c
Fix streamed handoff RunState agent update
seratch Jan 22, 2026
3136785
Use updated input for interruption results
seratch Jan 22, 2026
992494f
Apply partial nested approvals for agent tools
seratch Jan 22, 2026
e44a89f
Align hosted MCP approval call IDs
seratch Jan 22, 2026
a45d4b5
fix: apply session settings in run input prep
seratch Jan 28, 2026
3527988
fix: allow literal always/never tool names
seratch Jan 28, 2026
9033d3c
Merge branch 'main' into issue-636-hitl-2
seratch Jan 29, 2026
2 changes: 2 additions & 0 deletions .gitignore
@@ -149,3 +149,5 @@ cython_debug/

# Redis database files
dump.rdb

tmp/
18 changes: 18 additions & 0 deletions AGENTS.md
@@ -55,6 +55,24 @@ The OpenAI Agents Python repository provides the Python Agents SDK, examples, an
- `.github/PULL_REQUEST_TEMPLATE/pull_request_template.md`: Pull request template to use when opening PRs.
- `site/`: Built documentation output.

### Agents Core Runtime Guidelines

- `src/agents/run.py` is the runtime entrypoint (`Runner`, `AgentRunner`). Keep it focused on orchestration and public flow control. Put new runtime logic under `src/agents/run_internal/` and import it into `run.py`.
- When `run.py` grows, refactor helpers into `run_internal/` modules (for example `run_loop.py`, `turn_resolution.py`, `tool_execution.py`, `session_persistence.py`) and leave only wiring and composition in `run.py`.
- Keep streaming and non-streaming paths behaviorally aligned. Changes to `run_internal/run_loop.py` (`run_single_turn`, `run_single_turn_streamed`, `get_new_response`, `start_streaming`) should be mirrored, and any new streaming item types must be reflected in `src/agents/stream_events.py`.
- Input guardrails run only on the first turn and only for the starting agent. Resuming an interruption from `RunState` must not increment the turn counter; only actual model calls advance turns.
- Server-managed conversation (`conversation_id`, `previous_response_id`, `auto_previous_response_id`) uses `OpenAIServerConversationTracker` in `run_internal/oai_conversation.py`. Only deltas should be sent. If `call_model_input_filter` is used, it must return `ModelInputData` with a list input and the tracker must be updated with the filtered input (`mark_input_as_sent`). Session persistence is disabled when server-managed conversation is active.
- Adding new tool/output/approval item types requires coordinated updates across:
  - `src/agents/items.py` (RunItem types and conversions)
  - `src/agents/run_internal/run_steps.py` (ProcessedResponse and tool run structs)
  - `src/agents/run_internal/turn_resolution.py` (model output processing, run item extraction)
  - `src/agents/run_internal/tool_execution.py` and `src/agents/run_internal/tool_planning.py`
  - `src/agents/run_internal/items.py` (normalization, dedupe, approval filtering)
  - `src/agents/stream_events.py` (stream event names)
  - `src/agents/run_state.py` (RunState serialization/deserialization)
  - `src/agents/run_internal/session_persistence.py` (session save/rewind)
- If the serialized RunState shape changes, bump `CURRENT_SCHEMA_VERSION` in `src/agents/run_state.py` and update serialization/deserialization accordingly.
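
The schema-version guideline above can be illustrated with a minimal, self-contained sketch. This is not the SDK's `RunState` code: `SCHEMA_VERSION`, `SUPPORTED_VERSIONS`, `ToyState`, and the `schema_version` key are hypothetical stand-ins. The only point is that the serialized payload carries a version, and the loader checks it before rebuilding state.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Hypothetical constants for illustration; the real constant is
# CURRENT_SCHEMA_VERSION in src/agents/run_state.py.
SCHEMA_VERSION = "1.1"
SUPPORTED_VERSIONS = {"1.0", "1.1"}


@dataclass
class ToyState:
    """Stand-in for a serializable run state (not the SDK's RunState)."""

    current_turn: int = 0
    pending_approvals: List[str] = field(default_factory=list)

    def to_json(self) -> dict:
        # Stamp every payload with the version that produced it.
        return {
            "schema_version": SCHEMA_VERSION,
            "current_turn": self.current_turn,
            "pending_approvals": list(self.pending_approvals),
        }

    @classmethod
    def from_json(cls, payload: dict) -> "ToyState":
        version = payload.get("schema_version")
        if version not in SUPPORTED_VERSIONS:
            raise ValueError(f"Unsupported schema version: {version!r}")
        # A field added in 1.1 falls back to a default when loading 1.0 data.
        return cls(
            current_turn=payload["current_turn"],
            pending_approvals=list(payload.get("pending_approvals", [])),
        )


state = ToyState(current_turn=2, pending_approvals=["call_1"])
restored = ToyState.from_json(json.loads(json.dumps(state.to_json())))
```

Bumping the version whenever the shape changes lets an old payload fail loudly, or be migrated deliberately, instead of being misread.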

## Operation Guide

### Prerequisites
36 changes: 31 additions & 5 deletions examples/agent_patterns/agents_as_tools_conditional.py
@@ -2,8 +2,9 @@

 from pydantic import BaseModel

-from agents import Agent, AgentBase, RunContextWrapper, Runner, trace
-from examples.auto_mode import input_with_fallback
+from agents import Agent, AgentBase, ModelSettings, RunContextWrapper, Runner, trace
+from agents.tool import function_tool
+from examples.auto_mode import confirm_with_fallback, input_with_fallback

 """
 This example demonstrates the agents-as-tools pattern with conditional tool enabling.
@@ -26,10 +27,18 @@ def european_enabled(ctx: RunContextWrapper[AppContext], agent: AgentBase) -> bo
     return ctx.context.language_preference == "european"


+@function_tool(needs_approval=True)
+async def get_user_name() -> str:
+    print("Getting the user's name...")
+    return "Kaz"
+
+
 # Create specialized agents
 spanish_agent = Agent(
     name="spanish_agent",
-    instructions="You respond in Spanish. Always reply to the user's question in Spanish.",
+    instructions="You respond in Spanish. Always reply to the user's question in Spanish. You must call all the tools to best answer the user's question.",
+    model_settings=ModelSettings(tool_choice="required"),
+    tools=[get_user_name],
 )

 french_agent = Agent(
@@ -55,6 +64,7 @@ def european_enabled(ctx: RunContextWrapper[AppContext], agent: AgentBase) -> bo
             tool_name="respond_spanish",
             tool_description="Respond to the user's question in Spanish",
             is_enabled=True,  # Always enabled
+            needs_approval=True,  # HITL
         ),
         french_agent.as_tool(
             tool_name="respond_french",
@@ -109,8 +119,24 @@ async def main():
             input=user_request,
             context=context.context,
         )
-
-        print(f"\nResponse:\n{result.final_output}")
+        while result.interruptions:
+
+            async def confirm(question: str) -> bool:
+                return confirm_with_fallback(f"{question} (y/n): ", default=True)
+
+            state = result.to_state()
+            for interruption in result.interruptions:
+                prompt = f"\nDo you approve this tool call: {interruption.name} with arguments {interruption.arguments}?"
+                confirmed = await confirm(prompt)
+                if confirmed:
+                    state.approve(interruption)
+                    print(f"✓ Approved: {interruption.name}")
+                else:
+                    state.reject(interruption)
+                    print(f"✗ Rejected: {interruption.name}")
+            result = await Runner.run(orchestrator, state)
+
+        print(f"\nResponse:\n{result.final_output}")


 if __name__ == "__main__":
137 changes: 137 additions & 0 deletions examples/agent_patterns/human_in_the_loop.py
@@ -0,0 +1,137 @@
"""Human-in-the-loop example with tool approval.

This example demonstrates how to:
1. Define tools that require approval before execution
2. Handle interruptions when tool approval is needed
3. Serialize/deserialize run state to continue execution later
4. Approve or reject tool calls based on user input
"""

import asyncio
import json
from pathlib import Path

from agents import Agent, Runner, RunState, function_tool
from examples.auto_mode import confirm_with_fallback


@function_tool
async def get_weather(city: str) -> str:
    """Get the weather for a given city.

    Args:
        city: The city to get weather for.

    Returns:
        Weather information for the city.
    """
    return f"The weather in {city} is sunny"


async def _needs_temperature_approval(_ctx, params, _call_id) -> bool:
    """Check if temperature tool needs approval."""
    return "Oakland" in params.get("city", "")


@function_tool(
    # Dynamic approval: only require approval for Oakland
    needs_approval=_needs_temperature_approval
)
async def get_temperature(city: str) -> str:
    """Get the temperature for a given city.

    Args:
        city: The city to get temperature for.

    Returns:
        Temperature information for the city.
    """
    return f"The temperature in {city} is 20° Celsius"


# Main agent with tool that requires approval
agent = Agent(
    name="Weather Assistant",
    instructions=(
        "You are a helpful weather assistant. "
        "Answer questions about weather and temperature using the available tools."
    ),
    tools=[get_weather, get_temperature],
)

RESULT_PATH = Path(".cache/agent_patterns/human_in_the_loop/result.json")


async def confirm(question: str) -> bool:
    """Prompt user for yes/no confirmation.

    Args:
        question: The question to ask.

    Returns:
        True if user confirms, False otherwise.
    """
    return confirm_with_fallback(f"{question} (y/n): ", default=True)


async def main():
    """Run the human-in-the-loop example."""
    result = await Runner.run(
        agent,
        "What is the weather and temperature in Oakland?",
    )

    has_interruptions = len(result.interruptions) > 0

    while has_interruptions:
        print("\n" + "=" * 80)
        print("Run interrupted - tool approval required")
        print("=" * 80)

        # Storing state to file (demonstrating serialization)
        state = result.to_state()
        state_json = state.to_json()
        RESULT_PATH.parent.mkdir(parents=True, exist_ok=True)
        with RESULT_PATH.open("w") as f:
            json.dump(state_json, f, indent=2)

        print(f"State saved to {RESULT_PATH}")

        # From here on you could run things on a different thread/process

        # Reading state from file (demonstrating deserialization)
        print(f"Loading state from {RESULT_PATH}")
        with RESULT_PATH.open() as f:
            stored_state_json = json.load(f)

        state = await RunState.from_json(agent, stored_state_json)

        # Process each interruption
        for interruption in result.interruptions:
            print("\nTool call details:")
            print(f"  Agent: {interruption.agent.name}")
            print(f"  Tool: {interruption.name}")
            print(f"  Arguments: {interruption.arguments}")

            confirmed = await confirm("\nDo you approve this tool call?")

            if confirmed:
                print(f"✓ Approved: {interruption.name}")
                state.approve(interruption)
            else:
                print(f"✗ Rejected: {interruption.name}")
                state.reject(interruption)

        # Resume execution with the updated state
        print("\nResuming agent execution...")
        result = await Runner.run(agent, state)
        has_interruptions = len(result.interruptions) > 0

    print("\n" + "=" * 80)
    print("Final Output:")
    print("=" * 80)
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
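
The pause, decide, resume loop in the example above can be reduced to a small sketch that runs without the SDK or a model. Everything here is a hypothetical stand-in (`ToolCall`, `needs_approval`, `run_turn`, and the `decisions` and `outputs` dictionaries are not SDK names), and the tool outputs are placeholder strings. It assumes decisions are keyed by call id and that a tool which already produced an output is not executed again on resume, which is the behavior several commit messages in this PR describe.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one tool call requested by a model."""

    call_id: str
    name: str
    city: str


def needs_approval(call: ToolCall) -> bool:
    # Mirrors the example's predicate: only Oakland temperature lookups are gated.
    return call.name == "get_temperature" and "Oakland" in call.city


def run_turn(
    calls: List[ToolCall], decisions: Dict[str, bool], outputs: Dict[str, str]
) -> List[ToolCall]:
    """Execute what is allowed and return the calls still awaiting a decision."""
    pending = []
    for call in calls:
        if call.call_id in outputs:
            continue  # already handled before the pause; do not run twice
        decision = decisions.get(call.call_id)
        if needs_approval(call) and decision is None:
            pending.append(call)  # pause: a human has not decided yet
        elif decision is False:
            outputs[call.call_id] = "rejected"
        else:
            outputs[call.call_id] = f"{call.name}({call.city}) executed"
    return pending


calls = [
    ToolCall("call_1", "get_weather", "Oakland"),
    ToolCall("call_2", "get_temperature", "Oakland"),
]
decisions: Dict[str, bool] = {}
outputs: Dict[str, str] = {}

pending = run_turn(calls, decisions, outputs)
first_pass_pending = [call.call_id for call in pending]
while pending:
    for call in pending:
        decisions[call.call_id] = True  # a real program would ask a human here
    pending = run_turn(calls, decisions, outputs)
```

Keying decisions by call id rather than by tool name means that approving one call does not silently approve a later call to the same tool.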
119 changes: 119 additions & 0 deletions examples/agent_patterns/human_in_the_loop_stream.py
@@ -0,0 +1,119 @@
"""Human-in-the-loop example with streaming.

This example demonstrates the human-in-the-loop (HITL) pattern with streaming.
The agent will pause execution when a tool requiring approval is called,
allowing you to approve or reject the tool call before continuing.

The streaming version provides real-time feedback as the agent processes
the request, then pauses for approval when needed.
"""

import asyncio

from agents import Agent, Runner, function_tool
from examples.auto_mode import confirm_with_fallback


async def _needs_temperature_approval(_ctx, params, _call_id) -> bool:
    """Check if temperature tool needs approval."""
    return "Oakland" in params.get("city", "")


@function_tool(
    # Dynamic approval: only require approval for Oakland
    needs_approval=_needs_temperature_approval
)
async def get_temperature(city: str) -> str:
    """Get the temperature for a given city.

    Args:
        city: The city to get temperature for.

    Returns:
        Temperature information for the city.
    """
    return f"The temperature in {city} is 20° Celsius"


@function_tool
async def get_weather(city: str) -> str:
    """Get the weather for a given city.

    Args:
        city: The city to get weather for.

    Returns:
        Weather information for the city.
    """
    return f"The weather in {city} is sunny."


async def confirm(question: str) -> bool:
    """Prompt user for yes/no confirmation.

    Args:
        question: The question to ask.

    Returns:
        True if user confirms, False otherwise.
    """
    return confirm_with_fallback(f"{question} (y/n): ", default=True)


async def main():
    """Run the human-in-the-loop example."""
    main_agent = Agent(
        name="Weather Assistant",
        instructions=(
            "You are a helpful weather assistant. "
            "Answer questions about weather and temperature using the available tools."
        ),
        tools=[get_temperature, get_weather],
    )

    # Run the agent with streaming
    result = Runner.run_streamed(
        main_agent,
        "What is the weather and temperature in Oakland?",
    )
    async for _ in result.stream_events():
        pass  # Process streaming events silently or could print them

    # Handle interruptions
    while len(result.interruptions) > 0:
        print("\n" + "=" * 80)
        print("Human-in-the-loop: approval required for the following tool calls:")
        print("=" * 80)

        state = result.to_state()

        for interruption in result.interruptions:
            print("\nTool call details:")
            print(f"  Agent: {interruption.agent.name}")
            print(f"  Tool: {interruption.name}")
            print(f"  Arguments: {interruption.arguments}")

            confirmed = await confirm("\nDo you approve this tool call?")

            if confirmed:
                print(f"✓ Approved: {interruption.name}")
                state.approve(interruption)
            else:
                print(f"✗ Rejected: {interruption.name}")
                state.reject(interruption)

        # Resume execution with streaming
        print("\nResuming agent execution...")
        result = Runner.run_streamed(main_agent, state)
        async for _ in result.stream_events():
            pass  # Process streaming events silently or could print them

    print("\n" + "=" * 80)
    print("Final Output:")
    print("=" * 80)
    print(result.final_output)
    print("\nDone!")


if __name__ == "__main__":
    asyncio.run(main())
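
The streaming example drains `stream_events()` before it reads `result.interruptions`. The toy below sketches why that ordering could matter. It rests on an assumption, not on anything confirmed by the SDK: that pending approvals only become visible once the stream has been fully consumed. `ToyStreamedResult` and `drain` are hypothetical names, and the event strings are placeholders.

```python
import asyncio
from typing import AsyncIterator, List


class ToyStreamedResult:
    """Hypothetical stand-in for a streamed run result (not the SDK class)."""

    def __init__(self, events: List[str], pending: List[str]) -> None:
        self._events = events
        self._pending = pending
        self.interruptions: List[str] = []

    async def stream_events(self) -> AsyncIterator[str]:
        for event in self._events:
            await asyncio.sleep(0)  # yield control, as a real stream would
            yield event
        # Assumption: approvals are only known once the turn has finished.
        self.interruptions = list(self._pending)


async def drain(result: ToyStreamedResult) -> List[str]:
    """Consume every event so that interruptions can be read afterwards."""
    return [event async for event in result.stream_events()]


async def demo():
    result = ToyStreamedResult(["delta", "tool_called"], ["get_temperature"])
    before = list(result.interruptions)  # read too early: still empty
    seen = await drain(result)
    return before, seen, result.interruptions


before, seen, after = asyncio.run(demo())
```

In this toy, checking `interruptions` before draining would skip the approval loop entirely, which is the failure mode the example's ordering avoids.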