
Fix gpt-oss tool calling: pass tool args and tool responses as json #19620

Closed

matteoserva wants to merge 1 commit into ggml-org:master from matteoserva:fix-gptoss-tools

Conversation

@matteoserva
Contributor

The gpt-oss template expects JSON objects for tool arguments and tool responses, but llama.cpp passes strings, causing them to be escaped twice.

fixes #19520
supersedes #19553
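A minimal sketch of the double-escaping described above (in Python for a runnable illustration; the actual fix is in C++ in common/chat.cpp, and the `args_str` value is a made-up example):

```python
import json

# Tool arguments as emitted by the model: already a JSON-encoded string.
args_str = '{"location": "Paris"}'

# Passing the raw string to the template makes the renderer serialize it
# again, so every quote gets escaped a second time.
double_escaped = json.dumps(args_str)
print(double_escaped)  # "{\"location\": \"Paris\"}"

# Parsing first yields a JSON object the template can embed verbatim.
args_obj = json.loads(args_str)
print(json.dumps(args_obj))  # {"location": "Paris"}
```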

@matteoserva matteoserva requested a review from pwilkin as a code owner February 14, 2026 08:06
Contributor

@aldehir left a comment


We need a better way to handle tool content. This will break clients that return plain strings, which is perfectly valid.

Also, have you tested this against the Unsloth template? They patched this particular issue, and they appear to handle both cases, but this needs to be verified.

Additionally, please adjust the code style to match the rest of the file.

Comment thread on common/chat.cpp (lines +2036 to +2037):

```cpp
std::string response = msg.value("content",std::string());
adjusted_message["content"] = json::parse(response);
```
Contributor


There is no guarantee that clients return valid JSON as the tool response. Perhaps fall back to a string when parsing fails.
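The suggested fallback could be sketched like this (a runnable Python illustration; the actual code would wrap nlohmann::json's `json::parse` in a try/catch, and `parse_or_keep` is a hypothetical name):

```python
import json

def parse_or_keep(content: str):
    """Return parsed JSON when content is valid JSON, else the raw string."""
    try:
        return json.loads(content)
    except (json.JSONDecodeError, TypeError):
        return content

print(parse_or_keep('{"ok": true}'))  # {'ok': True}
print(parse_or_keep('plain text'))    # plain text
```

This keeps object responses unescaped in the rendered prompt while still accepting clients that return plain strings.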

@matteoserva
Contributor Author

You were right. The updated model fixed everything. Thanks very much.

@bfroemel

> You were right. The updated model fixed everything. Thanks very much

Unfortunately, the template fix wasn't picked up everywhere: most importantly, not upstream (OpenAI), and even https://huggingface.co/ggml-org/gpt-oss-120b-GGUF hasn't adopted it. So IMO the proposed code fix (with the changes suggested by @aldehir) could make sense and shouldn't interfere with the Unsloth template fix.

btw: I could never really understand why some people on reddit reported bad gpt-oss performance, and I always attributed it to some kind of OpenAI bashing and/or bots. Now I see that issues like this are indistinguishable from bad model performance for average users. It would be great if we had some automated way to verify that the model sees valid prompts; maybe in the future we'll have a small set of end-to-end test cases per model where API requests can be verified against expected prompts.

@pwilkin
Member

pwilkin commented Feb 18, 2026

@bfroemel that's exactly how the autoparser project started :) I wanted llama.cpp to be a reliable local agentic coding backend, and to my frustration I realized that something like five theoretically supported models that I tested in a row broke down within the first few tool-call requests. Then I started digging and realized that a lot of what is commonly attributed to "bad model quality" is actually bad parser behavior.

@aldehir
Contributor

aldehir commented Feb 18, 2026

I agree, this should be fixed for all templates. I'll try to bring this PR to the finish line when I find time, unless @matteoserva wants to. An easy workaround is to use the Unsloth template for now.

@matteoserva
Contributor Author

> I agree, this should be fixed for all templates. I'll try to bring this PR to the finish line when I find time, unless @matteoserva wants to. An easy workaround is to use the Unsloth template for now.

Yes, please do. :)



Development

Successfully merging this pull request may close these issues.

Eval bug: gpt-oss tool call template escapes arguments and responses twice

4 participants