chore(auto mode): prototype for model router #2788
base: main
Conversation
Pull request overview
This PR introduces a prototype model router that uses an external classification API to intelligently select which AI model should handle a user query based on whether it requires reasoning capabilities.
Changes:
- Adds `RouterDecisionFetcher`, a class that calls an external router API to determine the best model for a given prompt
- Integrates the router into `AutomodeService` to use routing decisions during endpoint resolution
- Changes the cache structure from storing a single endpoint per conversation to an array of endpoints
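For reference, the router response the fetcher consumes appears to carry the following fields. This shape is inferred from the logging call in the diff; the exact schema (and the sample model names below) are illustrative assumptions, not the authoritative API contract:

```typescript
// Sketch of the router API response, inferred from the fields the fetcher
// logs (predicted_label, chosen_model, confidence, scores, latency_ms,
// candidate_models). Not an authoritative schema.
interface RouterDecisionResponse {
	predicted_label: string;                                  // e.g. "needs_reasoning" or "no_reasoning"
	chosen_model: string;                                     // model id the router selected
	confidence: number;                                       // 0..1
	scores: { needs_reasoning: number; no_reasoning: number };
	latency_ms: number;
	candidate_models: string[];
}

// Hypothetical payload a router call might return (model names invented).
const example: RouterDecisionResponse = {
	predicted_label: 'needs_reasoning',
	chosen_model: 'model-large',
	confidence: 0.92,
	scores: { needs_reasoning: 0.92, no_reasoning: 0.08 },
	latency_ms: 41,
	candidate_models: ['model-large', 'model-mini'],
};
```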
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/platform/endpoint/node/routerDecisionFetcher.ts | New class that fetches model routing decisions from an external API based on query requirements |
| src/platform/endpoint/node/automodeService.ts | Integrates router decision fetcher, updates cache to store multiple endpoints, and uses router to select models |
Comments suppressed due to low confidence (3)
src/platform/endpoint/node/routerDecisionFetcher.ts:60
- The error message refers to "Reasoning classifier" in multiple places (lines 50, 55, 59) but the class is named RouterDecisionFetcher and the purpose is model routing. The terminology should be consistent - either rename references to use "model router" or clarify the relationship between "reasoning classification" and "routing" in the class documentation.
```typescript
			throw new Error(`Reasoning classifier API request failed: ${response.statusText}`);
		}
		const result: RouterDecisionResponse = await response.json();
		this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);
		return result.chosen_model;
	} catch (error) {
		this._logService.error('Reasoning classification failed', error);
		throw error;
```
src/platform/endpoint/node/automodeService.ts:173
- The API call does not validate that the chosen_model from the router response exists in the knownEndpoints array before attempting to use it. If the router returns a model that's not in knownEndpoints, selectedModel will be undefined and the fallback at line 174-175 will catch it. However, it would be more robust to validate that the routedModel is actually available in knownEndpoints before setting selectedModel, to avoid potential issues if the fallback logic changes.
```typescript
	const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
	selectedModel = knownEndpoints.find(e => e.model === routedModel);
}
```
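The explicit validation the comment asks for could look roughly like this. It is a sketch, not the PR's implementation; `Endpoint` is a stand-in for the real endpoint type, and the helper name is hypothetical:

```typescript
// Stand-in for the real endpoint type used in the diff.
interface Endpoint { model: string; }

// Sketch: validate the router's choice against known endpoints before
// trusting it, returning undefined so the existing fallback logic runs
// even if that logic changes later.
function pickRoutedEndpoint(knownEndpoints: Endpoint[], routedModel: string | undefined): Endpoint | undefined {
	if (!routedModel) {
		return undefined;
	}
	// Explicit membership check: only accept models we actually know about.
	return knownEndpoints.find(e => e.model === routedModel);
}
```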
src/platform/endpoint/node/automodeService.ts:168
- The preferredModels array includes all previously used models from the cache (line 168), but the cache structure change means endpoints can now accumulate without bound (see related issue with line 179). This could result in preferredModels becoming very large over the lifetime of a conversation, potentially impacting router performance. Consider limiting the number of preferred models or implementing proper cache management.
```typescript
const cachedModels = this._autoModelCache.get(conversationId)?.endpoints.map(e => e.model) || [];
const preferredModels = [...new Set([reserveToken.selected_model, ...cachedModels])];
```
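A minimal way to bound the preferred-models list, as the comment suggests, is to dedupe and cap it. The cap value and helper name below are arbitrary illustrative choices, not values from the PR:

```typescript
// MAX_PREFERRED_MODELS is an arbitrary illustrative limit, not a value
// from the PR; the right cap depends on what the router can handle.
const MAX_PREFERRED_MODELS = 5;

// Sketch: dedupe (keeping the currently selected model first) and cap the
// list so it cannot grow without bound over a long conversation.
function buildPreferredModels(selectedModel: string, cachedModels: string[]): string[] {
	return [...new Set([selectedModel, ...cachedModels])].slice(0, MAX_PREFERRED_MODELS);
}
```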
```typescript
if (chatRequest?.prompt.trim().length) {
	const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
	selectedModel = knownEndpoints.find(e => e.model === routedModel);
}
if (!selectedModel) {
	selectedModel = knownEndpoints.find(e => e.model === reserveToken.selected_model) || knownEndpoints[0];
}
```
Copilot AI · Jan 12, 2026
The router call will throw an error on network failures, but there's no try-catch around this code. When the router fails, the error is caught in getRoutedModel and re-thrown, which will propagate up and potentially break the entire endpoint resolution. Consider wrapping the router call in a try-catch block to gracefully fall back to the default model selection (reserveToken.selected_model) when the router is unavailable or fails.
This issue also appears in the following locations of the same file:
- line 171
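The graceful fallback this comment suggests could be sketched as follows. Names mirror the diff, but the function itself is illustrative, not the PR's code:

```typescript
// Stand-in for the real endpoint type used in the diff.
interface Endpoint { model: string; }

// Sketch of the suggested fallback: if the router call throws, log the
// failure and fall back to the reserve token's selection instead of
// letting the error break endpoint resolution.
async function selectModelWithFallback(
	getRoutedModel: () => Promise<string>,
	knownEndpoints: Endpoint[],
	defaultModel: string,
	logError: (msg: string, err: unknown) => void,
): Promise<Endpoint> {
	let selected: Endpoint | undefined;
	try {
		const routed = await getRoutedModel();
		selected = knownEndpoints.find(e => e.model === routed);
	} catch (err) {
		// Router unavailable: degrade gracefully rather than propagating.
		logError('Model router failed, falling back to default selection', err);
	}
	return selected ?? knownEndpoints.find(e => e.model === defaultModel) ?? knownEndpoints[0];
}
```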
```diff
 const autoEndpoint = this._instantiationService.createInstance(AutoChatEndpoint, selectedModel, reserveToken.session_token, reserveToken.discounted_costs?.[selectedModel.model] || 0, this._calculateDiscountRange(reserveToken.discounted_costs));
-this._autoModelCache.set(conversationId, { endpoint: autoEndpoint, tokenBank: reserveTokenBank });
+const existingEndpoints = this._autoModelCache.get(conversationId)?.endpoints || [];
+const newEndpoints = existingEndpoints.includes(autoEndpoint) ? existingEndpoints : [...existingEndpoints, autoEndpoint];
+this._autoModelCache.set(conversationId, { endpoints: newEndpoints, tokenBank: reserveTokenBank });
```
Copilot AI · Jan 12, 2026
The endpoint comparison using includes at line 179 will always be false because AutoChatEndpoint instances are created fresh each time. This means newEndpoints will always grow unboundedly in the cache, leading to a memory leak. The cache should either: (1) compare by model name rather than instance identity, (2) reuse existing endpoint instances when the model hasn't changed, or (3) use a different cache structure that maps model names to endpoints.
This issue also appears in the following locations of the same file:
- line 167
See below for a potential fix:
```typescript
const existingEndpoints = this._autoModelCache.get(conversationId)?.endpoints || [];
const existingEndpointForModel = existingEndpoints.find(e => e.model === selectedModel.model);
const autoEndpoint = existingEndpointForModel instanceof AutoChatEndpoint
	? existingEndpointForModel
	: this._instantiationService.createInstance(
		AutoChatEndpoint,
		selectedModel,
		reserveToken.session_token,
		reserveToken.discounted_costs?.[selectedModel.model] || 0,
		this._calculateDiscountRange(reserveToken.discounted_costs)
	);
const newEndpoints = existingEndpointForModel ? existingEndpoints : [...existingEndpoints, autoEndpoint];
```
```typescript
const result: RouterDecisionResponse = await response.json();

this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);
```
Copilot AI · Jan 12, 2026
When preferredModels is an empty array and passed to join, the log message at line 55 will include "preferred models: " with nothing after it. Consider handling the case where preferredModels is empty to make the log message clearer, such as showing "preferred models: none" or omitting this part when empty.
This issue also appears in the following locations of the same file:
- line 50
```diff
-this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);
+const preferredModelsDescription = preferredModels.length > 0 ? preferredModels.join(', ') : 'none';
+this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModelsDescription})`);
```
```typescript
export class RouterDecisionFetcher extends Disposable {
	constructor(
		private readonly _fetcherService: IFetcherService,
		private readonly _logService: ILogService
	) {
		super();
	}

	async getRoutedModel(query: string, availableModels: string[], preferredModels: string[]): Promise<string> {
		try {
			const response = await this._fetcherService.fetch(ROUTER_API_URL, {
				method: 'POST',
				headers: {
					'Content-Type': 'application/json'
				},
				body: JSON.stringify({ prompt: query, available_models: availableModels, preferred_models: preferredModels })
			});

			if (!response.ok) {
				throw new Error(`Reasoning classifier API request failed: ${response.statusText}`);
			}

			const result: RouterDecisionResponse = await response.json();

			this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);

			return result.chosen_model;
		} catch (error) {
			this._logService.error('Reasoning classification failed', error);
			throw error;
		}
	}
}
```
Copilot AI · Jan 12, 2026
The RouterDecisionFetcher class lacks test coverage. Given that there is a test file for similar endpoint functionality (copilotChatEndpoint.spec.ts) and this is a critical path for model selection, tests should be added to verify: (1) successful routing responses, (2) handling of API failures, (3) proper model selection from the response, and (4) logging behavior.
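The behaviors those tests should pin down can be sketched in a self-contained way. The stub below mirrors the shape of `getRoutedModel` so the example runs without the PR's files; it is an illustration of the cases to cover, not the real class or a drop-in spec file:

```typescript
// Minimal stand-in for the fetch response the router client sees.
interface RouterResponse {
	ok: boolean;
	statusText?: string;
	json(): Promise<{ chosen_model: string }>;
}

// Stand-in mirroring getRoutedModel's control flow, so the behaviors to
// test are explicit: (1) success returns chosen_model, (2) non-ok
// responses surface as thrown errors rather than silent defaults.
async function getRoutedModelStub(fetchImpl: () => Promise<RouterResponse>): Promise<string> {
	const response = await fetchImpl();
	if (!response.ok) {
		// API failures must propagate so callers can decide how to fall back.
		throw new Error(`Router API request failed: ${response.statusText}`);
	}
	const result = await response.json();
	// The chosen model comes straight from the response body.
	return result.chosen_model;
}
```

In the real spec file, the same two paths would be exercised against `RouterDecisionFetcher` with a mocked `IFetcherService`, plus assertions on the trace/error logging calls.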
```typescript
import { IFetcherService } from '../../networking/common/fetcherService';

// Remote reasoning classifier configuration
export const ROUTER_API_URL = 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';
```
Copilot AI · Jan 12, 2026
The router API URL is hardcoded and publicly exposed. Consider whether this endpoint should have authentication or if the URL should be configurable through environment variables or settings, especially for testing and deployment flexibility. This would also align with patterns used elsewhere in the codebase for external service endpoints.
```diff
-export const ROUTER_API_URL = 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';
+export const ROUTER_API_URL = process.env.GH_MODEL_ROUTER_API_URL ?? 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';
```
```typescript
import { IFetcherService } from '../../networking/common/fetcherService';

// Remote reasoning classifier configuration
export const ROUTER_API_URL = 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';
```
let's bring this in via ExP
```typescript
if (chatRequest?.prompt.trim().length) {
	const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
```
```diff
-if (chatRequest?.prompt.trim().length) {
-	const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
+const prompt = chatRequest?.prompt.trim();
+if (prompt?.length) {
+	const routedModel = await this._routerDecisionFetcher.getRoutedModel(prompt, availableModels, preferredModels);
```
```typescript
import { IFetcherService } from '../../networking/common/fetcherService';

// Remote reasoning classifier configuration
export const ROUTER_API_URL = '';
```
We shouldn't have this and should instead get the URL from ExP.