Fix gpt-oss tool calling: pass tool args and tool responses as JSON #19620
matteoserva wants to merge 1 commit into ggml-org:master
Conversation
aldehir
left a comment
We need a better way to handle tool content. This will break clients that return plain strings, which is perfectly valid.
Also, have you tested this against the Unsloth template? They patched this particular issue, and it appears they handle both cases, but this needs to be verified.
Additionally, please adjust the code style to match the rest of the file.
`std::string response = msg.value("content", std::string());`
`adjusted_message["content"] = json::parse(response);`
There is no guarantee clients return valid JSON as the tool response. Perhaps fall back to a string when parsing fails.
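The suggested fallback behavior could look roughly like this. The actual fix would live in llama.cpp's C++ chat-handling code (using nlohmann::json); this is just a Python sketch of the logic, and the function name is hypothetical:

```python
import json

def normalize_tool_content(content):
    """Sketch of the suggested fallback: parse the tool response as JSON
    when possible, otherwise keep the plain string, since clients may
    legitimately return either form."""
    try:
        return json.loads(content)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON (or not a string at all): pass it through unchanged.
        return content
```

With this, `'{"ok": true}'` becomes a proper object while `'plain text'` stays a string, so clients that return plain strings keep working.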
You were right. The updated model fixed everything. Thanks very much.
Unfortunately, the template fix wasn't picked up by everyone, e.g., most importantly not upstream (OpenAI), and even https://huggingface.co/ggml-org/gpt-oss-120b-GGUF hasn't. So IMO the proposed code fix (with the changes suggested by @aldehir) makes sense and shouldn't interfere with the Unsloth template fix.
Btw: I never really understood why some people on reddit reported bad gpt-oss performance, and I always attributed it to some kind of OpenAI bashing and/or bots. Now I see that issues like this are indistinguishable from bad model performance for average users. It would be great if we had some automated way to verify that the model sees valid prompts; maybe in the future we'll have a small set of end-to-end test cases per model where API requests can be verified against expected prompts.
@bfroemel that's exactly how the autoparser project started :) I wanted llama.cpp to be a reliable local agentic coding backend, and to my frustration I realized that around five theoretically supported models that I tested in a row broke down within the first few tool call requests. When I started digging, I realized that a lot of what is commonly attributed to "bad model quality" is actually bad parser behavior.
I agree, this should be fixed for all templates. I'll try to bring this PR to the finish line when I find time, unless @matteoserva wants to. An easy workaround is to use the Unsloth template for now. |
Yes, please do. :)
The gpt-oss template expects JSON objects for tool arguments and tool responses, but llama.cpp passes strings, causing them to be escaped twice.
fixes #19520
supersedes #19553
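The double escaping described in the commit message can be reproduced in a few lines. This is a Python sketch (llama.cpp itself does the equivalent in C++ with nlohmann::json), and the example payload is made up:

```python
import json

# Hypothetical tool-call arguments as a structured object.
tool_args = {"city": "Paris"}

# Serializing once gives the JSON object the template expects.
once = json.dumps(tool_args)
# -> '{"city": "Paris"}'

# If the arguments are passed along as an already-serialized string and
# serialized again, the quotes get escaped a second time, which is what
# the template then renders into the prompt.
twice = json.dumps(once)
# -> '"{\\"city\\": \\"Paris\\"}"'
```

The fix is to hand the template a parsed JSON object (falling back to a plain string when parsing fails) instead of serializing an already-serialized string.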