Don't crash BYOK/custom models when the response hits the length limit by vritant24 · Pull Request #324121 · microsoft/vscode

vritant24 · 2026-07-02T22:41:58Z

Fixes #319592

When a custom/BYOK model returns finish_reason: "length" (hit its context
window), we threw Error: Response too long. and discarded everything the model
already streamed.

By the time we get the Length result the partial text has already been
streamed via the finished callback, so instead of throwing we just resolve and
keep it. This matches how the rest of the codebase already handles Length
(defaultIntentRequestHandler, agentIntent, codeMapper all use the partial
text rather than failing).

Copilot

Pull request overview

This PR fixes issue #319592, where custom/BYOK OpenAI-compatible endpoints that return finish_reason: "length" (i.e. hit their context window) caused the Copilot language-model wrapper to throw Error: Response too long. and discard all text the model had already streamed. The fix intercepts the ChatFetchResponseType.Length result in CopilotLanguageModelWrapper._provideLanguageModelResponse and resolves gracefully (returning undefined) instead of throwing, since the partial text was already reported to the consumer via the streaming finishedCb. A warning is logged, and a unit test verifies the truncated response no longer surfaces as a hard failure.

Changes:

Treat a Length finish reason as a successful (truncated) response rather than throwing, keeping the already-streamed partial text.
Log a warning when a response is truncated due to the length limit.
Add a unit test asserting provideLanguageModelResponse does not throw on a Length response.

Show a summary per file

File	Description
extensions/copilot/src/extension/conversation/vscode-node/languageModelAccess.ts	Adds an early return for `ChatFetchResponseType.Length` before the failure path, logging a warning and returning `undefined` instead of throwing "Response too long."
extensions/copilot/src/extension/conversation/vscode-node/test/languageModelAccess.test.ts	Adds a "length finish reason" suite verifying that a truncated (`Length`) response does not throw.

Review details

Files reviewed: 2/2 changed files
Comments generated: 0
Review effort level: Medium

Add handling for truncated responses in CopilotLanguageModelWrapper

32bd236

Copilot AI review requested due to automatic review settings July 2, 2026 22:41

vritant24 self-assigned this Jul 2, 2026

Copilot started reviewing on behalf of vritant24 July 2, 2026 22:42 View session

Copilot AI reviewed Jul 2, 2026

View reviewed changes

vritant24 enabled auto-merge July 2, 2026 22:50

dmitrivMS approved these changes Jul 3, 2026

View reviewed changes

vritant24 merged commit f85c531 into main Jul 3, 2026
47 of 48 checks passed

vritant24 deleted the agents/investigate-unit-test-failure-fix branch July 3, 2026 00:47

vs-code-engineering Bot added this to the 1.128.0 milestone Jul 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Don't crash BYOK/custom models when the response hits the length limit#324121

Don't crash BYOK/custom models when the response hits the length limit#324121
vritant24 merged 1 commit into
mainfrom
agents/investigate-unit-test-failure-fix

vritant24 commented Jul 2, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

vritant24 commented Jul 2, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Review details

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants