
Conversation

@Sid200026 (Member)

No description provided.

Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces a prototype model router that uses an external classification API to intelligently select which AI model should handle a user query based on whether it requires reasoning capabilities.

Changes:

  • Adds RouterDecisionFetcher class that calls an external router API to determine the best model for a given prompt
  • Integrates the router into AutomodeService to use routing decisions during endpoint resolution
  • Changes cache structure from storing a single endpoint to an array of endpoints per conversation

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

Files reviewed:

  • src/platform/endpoint/node/routerDecisionFetcher.ts — New class that fetches model routing decisions from an external API based on query requirements
  • src/platform/endpoint/node/automodeService.ts — Integrates the router decision fetcher, updates the cache to store multiple endpoints, and uses the router to select models
Comments suppressed due to low confidence (3)

src/platform/endpoint/node/routerDecisionFetcher.ts:60

  • The error message refers to "Reasoning classifier" in multiple places (lines 50, 55, 59) but the class is named RouterDecisionFetcher and the purpose is model routing. The terminology should be consistent - either rename references to use "model router" or clarify the relationship between "reasoning classification" and "routing" in the class documentation.
				throw new Error(`Reasoning classifier API request failed: ${response.statusText}`);
			}

			const result: RouterDecisionResponse = await response.json();

			this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);

			return result.chosen_model;
		} catch (error) {
			this._logService.error('Reasoning classification failed', error);
			throw error;

src/platform/endpoint/node/automodeService.ts:173

  • The API call does not validate that the chosen_model from the router response exists in the knownEndpoints array before attempting to use it. If the router returns a model that's not in knownEndpoints, selectedModel will be undefined and the fallback at line 174-175 will catch it. However, it would be more robust to validate that the routedModel is actually available in knownEndpoints before setting selectedModel, to avoid potential issues if the fallback logic changes.
			const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
			selectedModel = knownEndpoints.find(e => e.model === routedModel);
		}
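
A hedged sketch of the validation the comment suggests; `Endpoint`, `pickEndpoint`, and the exact fallback order are illustrative assumptions, not the PR's real types:

```typescript
// Hypothetical sketch: accept the routed model only when it actually exists
// in knownEndpoints, and make the fallback chain explicit. These names are
// stand-ins for the PR's real code.
interface Endpoint {
	model: string;
}

function pickEndpoint(knownEndpoints: Endpoint[], routedModel: string | undefined, reservedModel: string): Endpoint {
	// Validate the router's answer against what is actually available.
	const routed = routedModel ? knownEndpoints.find(e => e.model === routedModel) : undefined;
	if (routed) {
		return routed;
	}
	// Fall back to the reserved model, then to the first known endpoint.
	return knownEndpoints.find(e => e.model === reservedModel) ?? knownEndpoints[0];
}
```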

src/platform/endpoint/node/automodeService.ts:168

  • The preferredModels array includes all previously used models from the cache (line 168), but the cache structure change means endpoints can now accumulate without bound (see related issue with line 179). This could result in preferredModels becoming very large over the lifetime of a conversation, potentially impacting router performance. Consider limiting the number of preferred models or implementing proper cache management.
		const cachedModels = this._autoModelCache.get(conversationId)?.endpoints.map(e => e.model) || [];
		const preferredModels = [...new Set([reserveToken.selected_model, ...cachedModels])];
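
One way to bound that list, sketched under the assumption that a small fixed cap is acceptable; `MAX_PREFERRED_MODELS` and `buildPreferredModels` are invented here, not defined by the PR:

```typescript
// Hypothetical sketch: de-duplicate and cap the preferred-models list so it
// cannot grow without bound over the lifetime of a conversation.
const MAX_PREFERRED_MODELS = 5; // assumed cap, not from the PR

function buildPreferredModels(reservedModel: string, cachedModels: string[]): string[] {
	// Set preserves insertion order, so the reserved model stays first.
	const unique = [...new Set([reservedModel, ...cachedModels])];
	return unique.slice(0, MAX_PREFERRED_MODELS);
}
```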

Comment on lines +170 to +176
	if (chatRequest?.prompt.trim().length) {
		const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
		selectedModel = knownEndpoints.find(e => e.model === routedModel);
	}
	if (!selectedModel) {
		selectedModel = knownEndpoints.find(e => e.model === reserveToken.selected_model) || knownEndpoints[0];
	}
Copilot AI Jan 12, 2026

The router call will throw an error on network failures, but there's no try-catch around this code. When the router fails, the error is caught in getRoutedModel and re-thrown, which will propagate up and potentially break the entire endpoint resolution. Consider wrapping the router call in a try-catch block to gracefully fall back to the default model selection (reserveToken.selected_model) when the router is unavailable or fails.

This issue also appears in the following locations of the same file:

  • line 171
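
The fallback the comment describes could look roughly like this; `resolveModel` and its parameters are stand-ins for the service's real code, not its actual API:

```typescript
// Hypothetical sketch: wrap the router call so a failure degrades to the
// reserved model instead of breaking endpoint resolution entirely.
async function resolveModel(getRoutedModel: () => Promise<string>, reservedModel: string): Promise<string> {
	try {
		return await getRoutedModel();
	} catch {
		// Router unavailable or errored: use the default selection.
		return reservedModel;
	}
}
```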

Comment on lines 171 to +180
	const autoEndpoint = this._instantiationService.createInstance(AutoChatEndpoint, selectedModel, reserveToken.session_token, reserveToken.discounted_costs?.[selectedModel.model] || 0, this._calculateDiscountRange(reserveToken.discounted_costs));
	this._autoModelCache.set(conversationId, { endpoint: autoEndpoint, tokenBank: reserveTokenBank });
	const existingEndpoints = this._autoModelCache.get(conversationId)?.endpoints || [];
	const newEndpoints = existingEndpoints.includes(autoEndpoint) ? existingEndpoints : [...existingEndpoints, autoEndpoint];
	this._autoModelCache.set(conversationId, { endpoints: newEndpoints, tokenBank: reserveTokenBank });
Copilot AI Jan 12, 2026

The endpoint comparison using includes at line 179 will always be false because AutoChatEndpoint instances are created fresh each time. This means newEndpoints will always grow unboundedly in the cache, leading to a memory leak. The cache should either: (1) compare by model name rather than instance identity, (2) reuse existing endpoint instances when the model hasn't changed, or (3) use a different cache structure that maps model names to endpoints.

This issue also appears in the following locations of the same file:

  • line 167

See below for a potential fix:

		const existingEndpoints = this._autoModelCache.get(conversationId)?.endpoints || [];
		const existingEndpointForModel = existingEndpoints.find(e => e.model === selectedModel.model);
		const autoEndpoint = existingEndpointForModel instanceof AutoChatEndpoint
			? existingEndpointForModel
			: this._instantiationService.createInstance(
				AutoChatEndpoint,
				selectedModel,
				reserveToken.session_token,
				reserveToken.discounted_costs?.[selectedModel.model] || 0,
				this._calculateDiscountRange(reserveToken.discounted_costs)
			);
		const newEndpoints = existingEndpointForModel ? existingEndpoints : [...existingEndpoints, autoEndpoint];

Copilot uses AI. Check for mistakes.

	const result: RouterDecisionResponse = await response.json();

	this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);
Copilot AI Jan 12, 2026

When preferredModels is an empty array and passed to join, the log message at line 55 will include "preferred models: " with nothing after it. Consider handling the case where preferredModels is empty to make the log message clearer, such as showing "preferred models: none" or omitting this part when empty.

This issue also appears in the following locations of the same file:

  • line 50
Suggested change
	this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);
	const preferredModelsDescription = preferredModels.length > 0 ? preferredModels.join(', ') : 'none';
	this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModelsDescription})`);

Comment on lines +31 to +63
export class RouterDecisionFetcher extends Disposable {
	constructor(
		private readonly _fetcherService: IFetcherService,
		private readonly _logService: ILogService
	) {
		super();
	}

	async getRoutedModel(query: string, availableModels: string[], preferredModels: string[]): Promise<string> {
		try {
			const response = await this._fetcherService.fetch(ROUTER_API_URL, {
				method: 'POST',
				headers: {
					'Content-Type': 'application/json'
				},
				body: JSON.stringify({ prompt: query, available_models: availableModels, preferred_models: preferredModels })
			});

			if (!response.ok) {
				throw new Error(`Reasoning classifier API request failed: ${response.statusText}`);
			}

			const result: RouterDecisionResponse = await response.json();

			this._logService.trace(`Reasoning classifier prediction: ${result.predicted_label}, model: ${result.chosen_model} (confidence: ${(result.confidence * 100).toFixed(1)}%, scores: needs_reasoning=${(result.scores.needs_reasoning * 100).toFixed(1)}%, no_reasoning=${(result.scores.no_reasoning * 100).toFixed(1)}%) (latency_ms: ${result.latency_ms}, candidate models: ${result.candidate_models.join(', ')}, preferred models: ${preferredModels.join(', ')})`);

			return result.chosen_model;
		} catch (error) {
			this._logService.error('Reasoning classification failed', error);
			throw error;
		}
	}
}
Copilot AI Jan 12, 2026

The RouterDecisionFetcher class lacks test coverage. Given that there is a test file for similar endpoint functionality (copilotChatEndpoint.spec.ts) and this is a critical path for model selection, tests should be added to verify: (1) successful routing responses, (2) handling of API failures, (3) proper model selection from the response, and (4) logging behavior.
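
A minimal sketch of how such a test might stub the fetch layer; `FakeFetcher`, the response shape, and the free-standing `getRoutedModel` helper are assumptions for illustration, not the codebase's real `IFetcherService`:

```typescript
// Hypothetical sketch: exercise routing logic against a stubbed fetcher so
// both the success path and the API-failure path can be asserted without
// any network access.
interface RouterResponse {
	chosen_model: string;
}

class FakeFetcher {
	constructor(private readonly response: RouterResponse, private readonly ok: boolean = true) { }
	async fetch(): Promise<{ ok: boolean; json: () => Promise<RouterResponse> }> {
		return { ok: this.ok, json: async () => this.response };
	}
}

async function getRoutedModel(fetcher: FakeFetcher): Promise<string> {
	const response = await fetcher.fetch();
	if (!response.ok) {
		// Mirrors the fetcher's behavior of throwing on a non-OK response.
		throw new Error('router request failed');
	}
	return (await response.json()).chosen_model;
}
```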

import { IFetcherService } from '../../networking/common/fetcherService';

// Remote reasoning classifier configuration
export const ROUTER_API_URL = 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';
Copilot AI Jan 12, 2026

The router API URL is hardcoded and publicly exposed. Consider whether this endpoint should have authentication or if the URL should be configurable through environment variables or settings, especially for testing and deployment flexibility. This would also align with patterns used elsewhere in the codebase for external service endpoints.

Suggested change
export const ROUTER_API_URL = 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';
export const ROUTER_API_URL = process.env.GH_MODEL_ROUTER_API_URL ?? 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';

@aeschli aeschli assigned lramos15 and unassigned aeschli Jan 12, 2026
@lramos15 lramos15 assigned TylerLeonhardt and sbatten and unassigned lramos15 Jan 12, 2026
import { IFetcherService } from '../../networking/common/fetcherService';

// Remote reasoning classifier configuration
export const ROUTER_API_URL = 'https://gh-model-router-v1.yellowforest-598004f3.westus3.azurecontainerapps.io/predict';
Member

let's bring this in via ExP

Comment on lines +170 to +171
	if (chatRequest?.prompt.trim().length) {
		const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
Member

Suggested change
	if (chatRequest?.prompt.trim().length) {
		const routedModel = await this._routerDecisionFetcher.getRoutedModel(chatRequest.prompt.trim(), availableModels, preferredModels);
	const prompt = chatRequest?.prompt.trim();
	if (prompt?.length) {
		const routedModel = await this._routerDecisionFetcher.getRoutedModel(prompt, availableModels, preferredModels);

import { IFetcherService } from '../../networking/common/fetcherService';

// Remote reasoning classifier configuration
export const ROUTER_API_URL = '';
Member

We shouldn't have this and instead get the URL from ExP

5 participants