Open
65 commits
5aab5f4
feat: implement human-in-the-loop (HITL) approval infrastructure
mjschock Nov 1, 2025
b3334b3
feat: integrate HITL approval checking into run execution loop
mjschock Nov 1, 2025
59bfbd5
feat: add RunState parameter support to Runner.run() methods
mjschock Nov 1, 2025
488f692
feat: add to_state() method to RunResult for resuming runs
mjschock Nov 1, 2025
9830de1
feat: add streaming HITL support and complete human-in-the-loop imple…
mjschock Nov 1, 2025
c89e5c5
fix: prime server conversation tracker in streaming path to prevent m…
mjschock Nov 1, 2025
529aa17
ci: fix issues that surfaced in CI
mjschock Nov 4, 2025
b6a3b5c
fix: Bring up coverage to minimum and add session hitl examples
mjschock Nov 4, 2025
c491721
fix: ensure RunState serialization compatibility with openai-agents-js
mjschock Nov 8, 2025
c69d05f
fix: standardize call_id extraction in ServerConversationTracker
mjschock Nov 8, 2025
1c39142
test: add tests for RunState resumption and serialization to bring up…
mjschock Nov 9, 2025
ce49285
fix: Updates following rebase, include test coverage
mjschock Nov 14, 2025
ff246c3
fix: address issues around resuming run state with conversation history
mjschock Nov 17, 2025
8f2c650
fix: address duplicating session history issue mentioned by @chatgpt-…
mjschock Nov 17, 2025
181ca79
fix: update RunState with current turn persisted item tracking
mjschock Nov 17, 2025
8a5e648
fix: addressing edge cases when resuming
mjschock Nov 17, 2025
0421466
fix: addressing edge cases when resuming (continued)
mjschock Nov 17, 2025
44fd3bc
fix: addressing rebase issues
mjschock Nov 21, 2025
7c8f95e
fix: improving parity with openai-agent-js hitl functionality
mjschock Nov 21, 2025
162c788
fix: bring coverage back up, addressing edge cases
mjschock Nov 22, 2025
3a8a510
fix: cleanup
mjschock Nov 23, 2025
ca22b73
fix: rename summary_message back to assistant_message
mjschock Nov 23, 2025
889ecd4
fix: enhance agent state management during resume, ensuring correct a…
mjschock Nov 26, 2025
db83c36
fix: finish up human-in-the-loop port
mjschock Dec 6, 2025
62f564d
fix: add auto_previous_response_id parameter to test_start_streaming_…
mjschock Dec 6, 2025
1ff9e2e
fix: typing updates to pass `make old_version_tests`
mjschock Dec 6, 2025
8cbb446
fix: remove dead code and add failing hitl error scenarios
mjschock Dec 9, 2025
f37cb22
fix: address failing hitl error scenarios
mjschock Dec 9, 2025
e3e2a8e
fix: change logging level for _ServerConversationTracker creation to …
mjschock Dec 9, 2025
da6ac80
fix: pass context_wrapper to _coerce_apply_patch_operation in ApplyPa…
mjschock Dec 10, 2025
1f5ab75
fix: add test to ensure current turn is preserved when converting Run…
mjschock Dec 10, 2025
25f08c8
fix: update RunResult to track current turn number and ensure it is p…
mjschock Dec 10, 2025
f7d0d38
fix: add tests to ensure ToolApprovalItem hashability and preserve pe…
mjschock Dec 10, 2025
527a433
refactor: simplify condition for current turn persisted item count in…
mjschock Dec 10, 2025
b955cbf
test: add tests to preserve tool output types during run state serial…
mjschock Dec 10, 2025
e3fcf6b
fix: enhance output item conversion logic to preserve non-function ca…
mjschock Dec 10, 2025
594c2cd
fix: update test to use ResponseCustomToolCall for apply_patch call i…
mjschock Dec 11, 2025
7d92c61
fix: remove unused import of RunImpl in AgentRunner
mjschock Dec 18, 2025
225880c
test: add test to ensure prepare_input converts Pydantic models to JS…
mjschock Dec 20, 2025
099d96c
fix: improve input item conversion logic to handle both dicts and Run…
mjschock Dec 20, 2025
08af802
test: add test to verify prepare_input tracks sent items to prevent d…
mjschock Dec 20, 2025
6cc323a
fix: track sent items in _ServerConversationTracker to prevent duplic…
mjschock Dec 20, 2025
dde699a
test: add test to verify normalization of tool outputs by stripping p…
mjschock Dec 20, 2025
03cd28a
fix: ensure consistent normalization of input items by always calling…
mjschock Dec 20, 2025
b4c5f74
test: add test to verify filtering of items by fingerprint in prepare…
mjschock Dec 20, 2025
5a36beb
feat: add original_input_fingerprints to _ServerConversationTracker f…
mjschock Dec 21, 2025
71d0b53
test: add test for interrupted run state to ensure model response pre…
mjschock Dec 21, 2025
7b4f181
fix: add processed_response to RunImpl and AgentRunner for improved s…
mjschock Dec 21, 2025
ff0b7df
test: add test to verify no duplicates are saved when resuming with a…
mjschock Dec 21, 2025
275a02a
fix: refine turn management and input saving logic to prevent duplica…
mjschock Dec 21, 2025
183d3c5
test: add approval check test to ensure function tools requiring appr…
mjschock Dec 21, 2025
2a71b86
fix: reorganize tool execution order to prioritize approval checks fo…
mjschock Dec 21, 2025
a0635a4
test: add test to verify interruption is raised when resuming with un…
mjschock Dec 21, 2025
7ea1cf3
fix: implement tool approval checks to handle interruptions before pr…
mjschock Dec 21, 2025
880888f
test: add test to verify that resuming after approval only executes u…
mjschock Dec 21, 2025
ac49221
fix: enhance tool execution logic to prevent re-execution of already …
mjschock Dec 21, 2025
c766e13
test: add test to ensure providerData is preserved during MCP approva…
mjschock Dec 21, 2025
980f49e
fix: preserve and restore providerData during normalization for MCP a…
mjschock Dec 21, 2025
e3efb5d
Revert "fix: refine turn management and input saving logic to prevent…
mjschock Dec 22, 2025
7d9afa0
Revert "test: add test to verify no duplicates are saved when resumin…
mjschock Dec 22, 2025
608abe3
fix: address lint issues after rebase
mjschock Dec 22, 2025
e4dec73
test: add test to verify no duplicates are created when resuming with…
mjschock Dec 22, 2025
28c008e
fix: refine logic to prevent duplicate item persistence during resump…
mjschock Dec 22, 2025
4248b87
test: add test to ensure structured tool outputs are preserved during…
mjschock Dec 22, 2025
8341547
fix: enhance output handling in run state serialization to preserve s…
mjschock Dec 22, 2025
140 changes: 140 additions & 0 deletions examples/agent_patterns/human_in_the_loop.py
@@ -0,0 +1,140 @@
"""Human-in-the-loop example with tool approval.

This example demonstrates how to:
1. Define tools that require approval before execution
2. Handle interruptions when tool approval is needed
3. Serialize/deserialize run state to continue execution later
4. Approve or reject tool calls based on user input
"""

import asyncio
import json

from agents import Agent, Runner, RunState, ToolApprovalItem, function_tool


@function_tool
async def get_weather(city: str) -> str:
    """Get the weather for a given city.

    Args:
        city: The city to get weather for.

    Returns:
        Weather information for the city.
    """
    return f"The weather in {city} is sunny"


async def _needs_temperature_approval(_ctx, params, _call_id) -> bool:
    """Check if temperature tool needs approval."""
    return "Oakland" in params.get("city", "")


@function_tool(
    # Dynamic approval: only require approval for Oakland
    needs_approval=_needs_temperature_approval
)
async def get_temperature(city: str) -> str:
    """Get the temperature for a given city.

    Args:
        city: The city to get temperature for.

    Returns:
        Temperature information for the city.
    """
    return f"The temperature in {city} is 20° Celsius"


# Main agent with tool that requires approval
agent = Agent(
    name="Weather Assistant",
    instructions=(
        "You are a helpful weather assistant. "
        "Answer questions about weather and temperature using the available tools."
    ),
    tools=[get_weather, get_temperature],
)


async def confirm(question: str) -> bool:
    """Prompt user for yes/no confirmation.

    Args:
        question: The question to ask.

    Returns:
        True if user confirms, False otherwise.
    """
    # Note: input() blocks, so it runs in the default executor to keep the
    # event loop free. A real application would use a proper async input source.
    loop = asyncio.get_running_loop()
    answer = await loop.run_in_executor(None, input, f"{question} (y/n): ")
    normalized = answer.strip().lower()
    return normalized in ("y", "yes")


async def main():
    """Run the human-in-the-loop example."""
    result = await Runner.run(
        agent,
        "What is the weather and temperature in Oakland?",
    )

    has_interruptions = len(result.interruptions) > 0

    while has_interruptions:
        print("\n" + "=" * 80)
        print("Run interrupted - tool approval required")
        print("=" * 80)

        # Storing state to file (demonstrating serialization)
        state = result.to_state()
        state_json = state.to_json()
        with open("result.json", "w") as f:
            json.dump(state_json, f, indent=2)

        print("State saved to result.json")

        # From here on you could run things on a different thread/process

        # Reading state from file (demonstrating deserialization)
        print("Loading state from result.json")
        with open("result.json") as f:
            stored_state_json = json.load(f)

        state = await RunState.from_json(agent, stored_state_json)

        # Process each interruption
        for interruption in result.interruptions:
            if not isinstance(interruption, ToolApprovalItem):
                continue

            print("\nTool call details:")
            print(f"  Agent: {interruption.agent.name}")
            print(f"  Tool: {interruption.name}")
            print(f"  Arguments: {interruption.arguments}")

            confirmed = await confirm("\nDo you approve this tool call?")

            if confirmed:
                print(f"✓ Approved: {interruption.name}")
                state.approve(interruption)
            else:
                print(f"✗ Rejected: {interruption.name}")
                state.reject(interruption)

        # Resume execution with the updated state
        print("\nResuming agent execution...")
        result = await Runner.run(agent, state)
        has_interruptions = len(result.interruptions) > 0

    print("\n" + "=" * 80)
    print("Final Output:")
    print("=" * 80)
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
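The save, decide, resume cycle in the example above can be illustrated without the SDK. The following is a minimal sketch, assuming a paused run boils down to a list of pending tool calls plus the decisions recorded for them. `PendingCall` and `PendingState` are hypothetical stand-ins invented for illustration; they are not the `RunState` or `ToolApprovalItem` types added in this PR, and the JSON layout is not the SDK's serialization format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class PendingCall:
    """Hypothetical stand-in for a tool call that is waiting for approval."""

    call_id: str
    tool: str
    arguments: Dict[str, str]


@dataclass
class PendingState:
    """Hypothetical stand-in for a paused run (not the SDK's RunState)."""

    calls: List[PendingCall]
    decisions: Dict[str, bool] = field(default_factory=dict)

    def approve(self, call: PendingCall) -> None:
        # Record an approval keyed by the call id.
        self.decisions[call.call_id] = True

    def reject(self, call: PendingCall) -> None:
        # Record a rejection keyed by the call id.
        self.decisions[call.call_id] = False

    def unresolved(self) -> List[PendingCall]:
        # Calls that still have no decision would interrupt the run again.
        return [c for c in self.calls if c.call_id not in self.decisions]

    def to_json(self) -> dict:
        # Plain dicts and lists only, so json.dump can write the result.
        return {
            "calls": [asdict(c) for c in self.calls],
            "decisions": dict(self.decisions),
        }

    @classmethod
    def from_json(cls, data: dict) -> "PendingState":
        return cls(
            calls=[PendingCall(**c) for c in data["calls"]],
            decisions=dict(data["decisions"]),
        )


# Pause: one call is waiting for a human decision.
state = PendingState(
    calls=[PendingCall("call_1", "get_temperature", {"city": "Oakland"})]
)

# Round-trip through a JSON string, as if handing off to another process.
restored = PendingState.from_json(json.loads(json.dumps(state.to_json())))

# Decide on the restored copy; the original object is left untouched.
restored.approve(restored.calls[0])
```

Keying decisions by call id rather than by object identity is what lets a decision made after deserialization still match the right call, which is the property the file round-trip in the example relies on.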
123 changes: 123 additions & 0 deletions examples/agent_patterns/human_in_the_loop_stream.py
@@ -0,0 +1,123 @@
"""Human-in-the-loop example with streaming.

This example demonstrates the human-in-the-loop (HITL) pattern with streaming.
The agent will pause execution when a tool requiring approval is called,
allowing you to approve or reject the tool call before continuing.

The streaming version provides real-time feedback as the agent processes
the request, then pauses for approval when needed.
"""

import asyncio

from agents import Agent, Runner, ToolApprovalItem, function_tool


async def _needs_temperature_approval(_ctx, params, _call_id) -> bool:
    """Check if temperature tool needs approval."""
    return "Oakland" in params.get("city", "")


@function_tool(
    # Dynamic approval: only require approval for Oakland
    needs_approval=_needs_temperature_approval
)
async def get_temperature(city: str) -> str:
    """Get the temperature for a given city.

    Args:
        city: The city to get temperature for.

    Returns:
        Temperature information for the city.
    """
    return f"The temperature in {city} is 20° Celsius"


@function_tool
async def get_weather(city: str) -> str:
    """Get the weather for a given city.

    Args:
        city: The city to get weather for.

    Returns:
        Weather information for the city.
    """
    return f"The weather in {city} is sunny."


async def confirm(question: str) -> bool:
    """Prompt user for yes/no confirmation.

    Args:
        question: The question to ask.

    Returns:
        True if user confirms, False otherwise.
    """
    # input() blocks, so run it in the default executor to keep the event loop free.
    loop = asyncio.get_running_loop()
    answer = await loop.run_in_executor(None, input, f"{question} (y/n): ")
    return answer.strip().lower() in ("y", "yes")


async def main():
    """Run the human-in-the-loop example."""
    main_agent = Agent(
        name="Weather Assistant",
        instructions=(
            "You are a helpful weather assistant. "
            "Answer questions about weather and temperature using the available tools."
        ),
        tools=[get_temperature, get_weather],
    )

    # Run the agent with streaming
    result = Runner.run_streamed(
        main_agent,
        "What is the weather and temperature in Oakland?",
    )
    async for _ in result.stream_events():
        pass  # Process streaming events silently or could print them

    # Handle interruptions
    while len(result.interruptions) > 0:
        print("\n" + "=" * 80)
        print("Human-in-the-loop: approval required for the following tool calls:")
        print("=" * 80)

        state = result.to_state()

        for interruption in result.interruptions:
            if not isinstance(interruption, ToolApprovalItem):
                continue

            print("\nTool call details:")
            print(f"  Agent: {interruption.agent.name}")
            print(f"  Tool: {interruption.name}")
            print(f"  Arguments: {interruption.arguments}")

            confirmed = await confirm("\nDo you approve this tool call?")

            if confirmed:
                print(f"✓ Approved: {interruption.name}")
                state.approve(interruption)
            else:
                print(f"✗ Rejected: {interruption.name}")
                state.reject(interruption)

        # Resume execution with streaming
        print("\nResuming agent execution...")
        result = Runner.run_streamed(main_agent, state)
        async for _ in result.stream_events():
            pass  # Process streaming events silently or could print them

    print("\n" + "=" * 80)
    print("Final Output:")
    print("=" * 80)
    print(result.final_output)
    print("\nDone!")


if __name__ == "__main__":
    asyncio.run(main())
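The `confirm` helper in these examples pushes the blocking `input` call off the event loop. One possible variant, sketched here as a suggestion rather than as part of this PR, uses `asyncio.to_thread` (Python 3.9+) and accepts the reader as a parameter so the prompt can be exercised without a terminal. The `read` parameter and the `parse_yes_no` helper are hypothetical names introduced for this sketch.

```python
import asyncio
from typing import Callable


def parse_yes_no(answer: str) -> bool:
    """Return True only for an explicit 'y' or 'yes', ignoring case and padding."""
    return answer.strip().lower() in ("y", "yes")


async def confirm(question: str, read: Callable[[str], str] = input) -> bool:
    """Ask a yes/no question without blocking the event loop.

    `read` is a hypothetical injection point; it defaults to the built-in
    input() and can be swapped for a stub in tests.
    """
    # asyncio.to_thread runs the blocking reader in a worker thread.
    answer = await asyncio.to_thread(read, f"{question} (y/n): ")
    return parse_yes_no(answer)


# Stubbed readers stand in for a human at the keyboard.
approved = asyncio.run(confirm("Approve?", read=lambda _prompt: " YES "))
rejected = asyncio.run(confirm("Approve?", read=lambda _prompt: "n"))
```

Anything other than an explicit yes counts as a rejection, which is the conservative default for an approval gate.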
117 changes: 117 additions & 0 deletions examples/memory/memory_session_hitl_example.py
@@ -0,0 +1,117 @@
"""
Example demonstrating SQLite in-memory session with human-in-the-loop (HITL) tool approval.

This example shows how to use SQLite in-memory session memory combined with
human-in-the-loop tool approval. The session maintains conversation history while
requiring approval for specific tool calls.
"""

import asyncio

from agents import Agent, Runner, SQLiteSession, function_tool


async def _needs_approval(_ctx, _params, _call_id) -> bool:
    """Always require approval for weather tool."""
    return True


@function_tool(needs_approval=_needs_approval)
def get_weather(location: str) -> str:
    """Get weather for a location.

    Args:
        location: The location to get weather for

    Returns:
        Weather information as a string
    """
    # Simulated weather data
    weather_data = {
        "san francisco": "Foggy, 58°F",
        "oakland": "Sunny, 72°F",
        "new york": "Rainy, 65°F",
    }
    # Check if any city name is in the provided location string
    location_lower = location.lower()
    for city, weather in weather_data.items():
        if city in location_lower:
            return weather
    return f"Weather data not available for {location}"


async def prompt_yes_no(question: str) -> bool:
    """Prompt user for yes/no answer.

    Args:
        question: The question to ask

    Returns:
        True if user answered yes, False otherwise
    """
    print(f"\n{question} (y/n): ", end="", flush=True)
    # input() blocks, so run it in the default executor to keep the event loop free.
    loop = asyncio.get_running_loop()
    answer = await loop.run_in_executor(None, input)
    normalized = answer.strip().lower()
    return normalized in ("y", "yes")


async def main():
    # Create an agent with a tool that requires approval
    agent = Agent(
        name="HITL Assistant",
        instructions="You help users with information. Always use available tools when appropriate. Keep responses concise.",
        tools=[get_weather],
    )

    # Create an in-memory SQLite session instance that will persist across runs
    session = SQLiteSession(":memory:")
    session_id = session.session_id

    print("=== Memory Session + HITL Example ===")
    print(f"Session id: {session_id}")
    print("Enter a message to chat with the agent. Submit an empty line to exit.")
    print("The agent will ask for approval before using tools.\n")

    while True:
        # Get user input without blocking the event loop
        print("You: ", end="", flush=True)
        loop = asyncio.get_running_loop()
        user_message = await loop.run_in_executor(None, input)

        if not user_message.strip():
            break

        # Run the agent
        result = await Runner.run(agent, user_message, session=session)

        # Handle interruptions (tool approvals)
        while result.interruptions:
            # Get the run state
            state = result.to_state()

            for interruption in result.interruptions:
                tool_name = interruption.raw_item.name  # type: ignore[union-attr]
                args = interruption.raw_item.arguments or "(no arguments)"  # type: ignore[union-attr]

                approved = await prompt_yes_no(
                    f"Agent {interruption.agent.name} wants to call '{tool_name}' with {args}. Approve?"
                )

                if approved:
                    state.approve(interruption)
                    print("Approved tool call.")
                else:
                    state.reject(interruption)
                    print("Rejected tool call.")

            # Resume the run with the updated state
            result = await Runner.run(agent, state, session=session)

        # Display the response
        reply = result.final_output or "[No final output produced]"
        print(f"Assistant: {reply}\n")


if __name__ == "__main__":
    asyncio.run(main())