## Summary

Claude models (e.g. `claude-opus-4.6`) fail with "Failed to get response from the AI model; retried 5 times" after a custom tool call completes successfully. GPT models work fine with the same code. The issue is 100% reproducible.
## Root cause from server logs

The CLI subprocess server log (`~/.copilot/logs/process-*.log`) shows:

```
StreamingErrorRetryProcessor: Non-API error detected (retry 1): missing finish_reason for choice 0
...
StreamingErrorRetryProcessor: Non-API error detected (retry 6): missing finish_reason for choice 0
Ignoring event of kind: turn_failed
```
The server log also shows:

```
"WEBSOCKET_RESPONSES": "false",
"copilot_cli_websocket_responses": "null",
```
This suggests the SDK CLI subprocess is being routed through the Chat Completions API path rather than the Responses API. The Copilot proxy's Claude → Chat Completions format adapter appears not to set `finish_reason` in the streaming response after tool results are submitted, causing `StreamingErrorRetryProcessor` to reject the stream as invalid.

This does not affect VS Code Copilot Chat or the `gh copilot` CLI, which use the Responses API path, where `finish_reason` is not applicable.
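The check described in the log can be illustrated with a minimal sketch. The chunk shape below is the standard Chat Completions streaming format; the validator itself is hypothetical (the real `StreamingErrorRetryProcessor` is internal to the CLI), but it reproduces the behavior the log describes: a stream whose last chunk for a choice carries no `finish_reason` is treated as a retryable non-API error.

```typescript
// Hypothetical sketch of the validation the server log describes.
interface StreamChunk {
  choices: { index: number; finish_reason?: string | null }[];
}

// Returns null if every choice terminated with a finish_reason,
// or an error message mirroring the one in the log.
function checkStreamCompletion(chunks: StreamChunk[]): string | null {
  const lastReason = new Map<number, string | null | undefined>();
  for (const chunk of chunks) {
    for (const choice of chunk.choices) {
      lastReason.set(choice.index, choice.finish_reason);
    }
  }
  for (const [index, reason] of lastReason) {
    if (reason == null) return `missing finish_reason for choice ${index}`;
  }
  return null;
}

// A well-formed tool-call stream ends with finish_reason set:
checkStreamCompletion([
  { choices: [{ index: 0, finish_reason: null }] },
  { choices: [{ index: 0, finish_reason: "tool_calls" }] },
]); // → null (valid)

// The Claude adapter path described above apparently never sets it:
checkStreamCompletion([
  { choices: [{ index: 0, finish_reason: null }] },
]); // → "missing finish_reason for choice 0"
```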
## Minimal repro

```typescript
import { CopilotClient, defineTool } from "@github/copilot-sdk";
import * as fs from "node:fs";
import * as fsp from "node:fs/promises";
import * as path from "node:path";
import * as os from "node:os";

async function runTest(model: string) {
  const configDir = path.join(os.tmpdir(), `sdk-repro-${model}-${Date.now()}`);
  await fsp.mkdir(configDir, { recursive: true });

  // Resolve native binary path
  const cliPath = [
    path.join(__dirname, "..", "node_modules", "@github", "copilot-darwin-arm64", "copilot"),
    path.join(__dirname, "..", "node_modules", "@github", "copilot-linux-x64", "copilot"),
  ].find(c => fs.existsSync(c));
  if (!cliPath) throw new Error("No native copilot binary found");

  const client = new CopilotClient({ cliPath });
  await client.start();

  const writeTool = defineTool("write_result", {
    description: "Write a result string.",
    parameters: {
      type: "object" as const,
      properties: { answer: { type: "string" as const, description: "The answer" } },
      required: ["answer"],
    },
    skipPermission: true,
    handler: async (args: { answer: string }) => {
      console.log(`  [handler] Tool called with: ${args.answer}`);
      return `OK: received "${args.answer}"`;
    },
  });

  const session = await client.createSession({
    model,
    configDir,
    enableConfigDiscovery: false,
    streaming: true,
    systemMessage: {
      mode: "replace" as const,
      content: "You are a test assistant. Call the write_result tool with your answer.",
    },
    tools: [writeTool],
    onPermissionRequest: () => ({ kind: "approved" as const }),
    infiniteSessions: { enabled: false },
  } as any);

  try {
    await session.sendAndWait(
      { prompt: "What is 2 + 2? Call write_result with your answer." },
      60_000
    );
    console.log(`  [${model}] SUCCESS`);
  } catch (err) {
    console.log(`  [${model}] FAILED: ${err instanceof Error ? err.message : err}`);
  }

  await session.disconnect().catch(() => {});
  await client.deleteSession(session.sessionId).catch(() => {});
  await client.stop().catch(() => {});
}

async function main() {
  for (const model of ["claude-opus-4.6", "gpt-5.4"]) {
    console.log(`--- ${model} ---`);
    await runTest(model);
  }
}

main();
```
## Output

```
--- claude-opus-4.6 ---
  [handler] Tool called with: 4
  [claude-opus-4.6] FAILED: Execution failed: Error: Failed to get response from the AI model;
  retried 5 times (total retry wait time: 5.36 seconds) Last error: Unknown error
--- gpt-5.4 ---
  [handler] Tool called with: 4
  [gpt-5.4] SUCCESS
```
**Note:** The tool handler runs successfully and returns a result for both models. The failure happens when the SDK sends the tool result back and waits for the model's follow-up response.
## Environment

- `@github/copilot-sdk`: 0.2.2
- `@github/copilot` (CLI binary): 1.0.27 (also reproduced on 1.0.25)
- Platform: macOS arm64
- Node.js: v22.x
- Tested models: `claude-opus-4.6`, `claude-sonnet-4`, `claude-haiku-4.5` (all fail); `gpt-5.4`, `gpt-4.1` (all succeed)
## Workaround
Catch the error after sendAndWait and check if the tool already wrote its output. If it did, treat the turn as successful and continue. This works because the tool call itself always completes — only the post-tool-call model response fails.
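A generic version of this workaround can be sketched as below. Nothing in it is SDK API: the caller supplies the turn (e.g. a closure around `sendAndWait`) and a check for whether the tool already produced its output (e.g. the file it wrote exists); the helper name and return shape are illustrative.

```typescript
// Workaround sketch: run a turn, and if it throws, fall back to a
// caller-supplied check for whether the tool already did its work.
// (Illustrative helper; only sendAndWait in the repro is real SDK API.)
async function runTurnWithToolFallback<T>(
  sendTurn: () => Promise<T>,     // e.g. () => session.sendAndWait(...)
  toolOutputReady: () => boolean, // e.g. checks the file the tool wrote
): Promise<{ ok: boolean; result?: T; recovered: boolean }> {
  try {
    const result = await sendTurn();
    return { ok: true, result, recovered: false };
  } catch (err) {
    if (toolOutputReady()) {
      // The tool call itself completed; only the post-tool-call model
      // response failed, so treat the turn as successful and continue.
      return { ok: true, recovered: true };
    }
    throw err; // genuine failure: the tool never ran
  }
}
```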
## Likely fix

Either:

- Enable the Responses API path for SDK CLI subprocess sessions (currently feature-flagged off), or
- Fix the Copilot proxy's Chat Completions adapter to set `finish_reason` correctly for Claude tool call responses.
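For the second option, the missing translation would look roughly like the sketch below. The value sets come from the public APIs (Anthropic's documented `stop_reason` values and Chat Completions' `finish_reason` values); the mapping function itself is an assumption about what the adapter fix would do, not the proxy's actual code.

```typescript
// Sketch of the proposed adapter fix: translate Anthropic stop_reason
// values to Chat Completions finish_reason values, to be emitted on the
// final streaming chunk. (Illustrative; not the proxy's actual code.)
type ClaudeStopReason = "end_turn" | "tool_use" | "max_tokens" | "stop_sequence";
type ChatFinishReason = "stop" | "tool_calls" | "length";

function toFinishReason(stop: ClaudeStopReason): ChatFinishReason {
  switch (stop) {
    case "tool_use":
      return "tool_calls"; // the tool-call case failing in this report
    case "max_tokens":
      return "length";
    case "end_turn":
    case "stop_sequence":
      return "stop";
  }
}
```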