Preserve non-text tool outputs in LiteLLM and chatcmpl converters #2214

ihower · 2025-12-20T10:05:18Z

Resolves: #2163

This PR fixes an issue where non-text tool outputs (such as ToolOutputImage) are dropped when using the LiteLLM and chatcmpl adapter.

Previously, in the tool output conversion logic, the adapter used:

extract_text_content(output_content)

which only keeps text and drops image, audio, and file inputs.

This PR switches it to:

extract_all_content(output_content)

so the following types are preserved:

input_text
input_image
input_audio
input_file

Backwards compatibility

I think this change should be mostly safe and backwards compatible

If the tool returns a str, behavior is unchanged.
If the tool returns a list or dict with only type="text", behavior is unchanged.

If the tool returns images or files:

Some model providers actually support and can consume them (e.g. Claude, Azure?).
Some model providers ignore them and only process the text part (e.g. OpenAI Chat Completions, Gemini, DeepSeek). This is the same result as the previous behavior where non-text parts were removed.

Example Code

The following example will work after this PR using Claude model.

from agents import ToolOutputImage, ToolOutputText, function_tool, Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel
from typing import Union

import json
import base64
import litellm

@function_tool
async def retrieve_test_image() -> Union[ToolOutputImage, str]:
    path = "a_test_image.jpg"
    b64 = base64.b64encode(open(path, "rb").read()).decode()
    return [
        ToolOutputText(text="hello"),
        ToolOutputImage(image_url=f"data:image/jpeg;base64,{b64}", detail="high")
    ]

test_agent = Agent(
    name="Image Test Agent",
    instructions="You retrieve and describe images.",
    model=LitellmModel(model="anthropic/claude-sonnet-4-5-20250929"),
    tools=[retrieve_test_image],
)

result = Runner.run_sync(
    test_agent,
    "call retrieve_test_image. What do you see in this image and text?"
)
print(result.final_output)

# In this image, I can see....
# The function also returned the text output "hello".

Implementation Notes

To make this change type-safe, I introduced ExtendedChatCompletionToolMessageParam with a broader content type that matches the return value of extract_all_content. Without this, switching to extract_all_content directly results in a mypy error:

src/agents/models/chatcmpl_converter.py:547: error: Incompatible types (expression has type
"str | list[ChatCompletionContentPartTextParam | ChatCompletionContentPartImageParam | ChatCompletionContentPartInputAudioParam | File]",
TypedDict item "content" has type "str | Iterable[ChatCompletionContentPartTextParam]")

If this feels too heavy-weight, an alternative is to keep the existing type and ignore the typing error explicitly:

"content": cls.extract_all_content(output_content),  # type: ignore[typeddict-item]

I can switch to this approach instead if it’s preferred.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-12-20T10:08:02Z

src/agents/models/chatcmpl_converter.py

+                msg: ExtendedChatCompletionToolMessageParam = {
                    "role": "tool",
                    "tool_call_id": func_output["call_id"],
-                    "content": cls.extract_text_content(output_content),
+                    "content": cls.extract_all_content(output_content),


Avoid sending non-text tool content to OpenAI Chat Completions

Tool outputs are now passed through extract_all_content, which includes image/audio/file parts, directly into the tool message payload (lines 556-559). The OpenAI ChatCompletions API only accepts text for tool messages (ChatCompletionToolMessageParam is limited to str or text parts), so when a tool returns a ToolOutputImage, input_audio, or file result and this converter is used by ChatCompletionsModel, the request will be rejected with an invalid payload instead of gracefully ignoring the non-text content as before. This is a regression for any OpenAI/Azure chat-completions call whose tools emit media output.

Useful? React with 👍 / 👎.

We cannot accept the code that could potentially break existing OpenAI Chat Completions API code.

I have tested this with the OpenAI chat completions endpoint. In practice, the server ignores unsupported non-text tool content instead of returning an API error, so this does not seem to break the API.

Before this change, non-text tool outputs were silently dropped by the SDK. Now they are preserved. If a provider ignores them, the behavior is the same. If a provider returns an error, that is actually better because it clearly shows that the provider does not support media in tool outputs.

For this reason, I don’t think the SDK needs to filter this content on behalf of developers. This flexibility can be left to developers, since some providers do support images in tool outputs.

FYI, before v0.3.3 the SDK did not filter this content either. The filtering was introduced later during the openai-python upgrade, and it seems to have been added mainly to satisfy type checking rather than due to a strict API requirement.

This converter is primarily used for OpenAI's Chat Completions API model, so even if the server endpoint ignores the data pattern, making the data compatible with underlying OpenAI model interface is still important. Also, the server behavior could be changed in the future because the current behavior where the endpoint ignores the unsupported data structure is not clearly mentioned in the public documents.

For the benefit of LiteLLM users, I think enabling the callers of this converter to customize the behavior here by adding overload methods, which accept customization option may be a good approach.

ihower added 2 commits December 20, 2025 17:52

fix: Preserve non-text tool outputs in LiteLLM and chatcmpl converter

70ca9c9

feat: Support media content in tool message outputs

9ba59f3

chatgpt-codex-connector bot reviewed Dec 20, 2025

View reviewed changes

seratch added enhancement New feature or request feature:lite-llm labels Dec 22, 2025

seratch added this to the 0.7.x milestone Dec 22, 2025

seratch marked this pull request as draft December 22, 2025 02:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Preserve non-text tool outputs in LiteLLM and chatcmpl converters #2214

Preserve non-text tool outputs in LiteLLM and chatcmpl converters #2214

ihower commented Dec 20, 2025 •

edited

Loading

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Dec 20, 2025

Uh oh!

seratch Dec 22, 2025

Uh oh!

ihower Dec 22, 2025 •

edited

Loading

Uh oh!

seratch Dec 22, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Preserve non-text tool outputs in LiteLLM and chatcmpl converters #2214

Are you sure you want to change the base?

Preserve non-text tool outputs in LiteLLM and chatcmpl converters #2214

Conversation

ihower commented Dec 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Backwards compatibility

Example Code

Implementation Notes

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Dec 20, 2025

Choose a reason for hiding this comment

Uh oh!

seratch Dec 22, 2025

Choose a reason for hiding this comment

Uh oh!

ihower Dec 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seratch Dec 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ihower commented Dec 20, 2025 •

edited

Loading

ihower Dec 22, 2025 •

edited

Loading

seratch Dec 22, 2025 •

edited

Loading