perf(otel-sqlite): add stored session_id column to eliminate full-table scans#314900
Open
makroumi wants to merge 3 commits intomicrosoft:mainfrom
Open
perf(otel-sqlite): add stored session_id column to eliminate full-table scans#314900makroumi wants to merge 3 commits intomicrosoft:mainfrom
makroumi wants to merge 3 commits intomicrosoft:mainfrom
Conversation
…le scans SCHEMA_VERSION 1 -> 2. Problem: Every session-scoped query and cleanup operation performed a full table scan on spans. COALESCE(conversation_id, chat_session_id) in WHERE and GROUP BY clauses prevents index usage; SQLite cannot use an index on a function-wrapped column. OR across two indexed columns (conversation_id, chat_session_id) forces the query planner to evaluate both indexes and merge results, also degrading to a scan at scale. Changes: - Add session_id TEXT GENERATED ALWAYS AS (COALESCE(conversation_id, chat_session_id)) STORED to spans. SQLite computes and stores this at write time, zero application-layer changes to INSERT paths. - Add idx_spans_session ON spans(session_id, start_time_ms) as a composite index covering both the session lookup and the ORDER BY start_time_ms present in getSpansByConversationId. - Rewrite getSpansByConversationId: WHERE session_id = ? (index seek, was full scan via OR). - Rewrite getTraceIds: WHERE session_id = ? (index seek, was full scan via OR). - Rewrite sessions view: GROUP BY session_id, WHERE session_id IS NOT NULL (index seek, was COALESCE in GROUP BY forcing full scan). - Rewrite _cleanupOnStartup: GROUP BY session_id in both the cutoff subquery and the DELETE subquery (index seek, was COALESCE full scan). - Replace NOT IN subquery with NOT EXISTS in _cleanupOnStartup. Correctness of NOT IN depended on an implicit NULL guard inside the subquery. NOT EXISTS is NULL-safe by structure. - v1 to v2 migration: ALTER TABLE RENAME, CREATE TABLE with generated column, INSERT SELECT with explicit 24-column list excluding session_id, DROP old table, recreate indexes, DROP and recreate sessions view. SQLite does not support ADD COLUMN for generated columns via ALTER TABLE. Verified: - EXPLAIN QUERY PLAN confirms SEARCH spans USING INDEX idx_spans_session for all session-scoped queries (previously SCAN spans). - Migration preserves row count and correctly computes session_id for conversation_id rows, chat_session_id fallback rows, and null rows. - sessions view returns correct aggregates post-migration. - Zero changes to public API surface or INSERT parameter lists.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR improves performance of the Copilot extension’s OTelSqliteStore SQLite queries by introducing a persisted session_id generated column and a composite index, allowing session-scoped lookups and aggregations to use indexed seeks instead of full-table scans.
Changes:
- Bump schema to v2 and add
session_id TEXT GENERATED ALWAYS AS (COALESCE(conversation_id, chat_session_id)) STOREDplusidx_spans_session(session_id, start_time_ms). - Rewrite session-scoped queries (
getSpansByConversationId,getTraceIds) and thesessionsview to usesession_id. - Update startup cleanup logic to group by
session_idand switch the retention delete toNOT EXISTS.
Author
|
@microsoft-github-policy-service agree |
- Use MAX(version) and single-row enforcement to prevent schema_version accumulation (INSERT OR REPLACE with different PK values creates new rows) - Use CREATE new / DROP old / RENAME pattern to preserve FK integrity (ALTER TABLE RENAME breaks FKs pointing at the old table name) - Wrap migrations in BEGIN IMMEDIATE / COMMIT / ROLLBACK for atomicity
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Every session-scoped query in
OTelSqliteStoreperformed a full table scan onspans.Root causes confirmed with
EXPLAIN QUERY PLAN:COALESCE(conversation_id, chat_session_id)inWHEREandGROUP BY; SQLite cannot use an index on a function-wrapped column. Bothidx_spans_conversationandidx_spans_chat_sessionwere defined but unreachable for these queries.ORacross two columns (conversation_id = ? OR chat_session_id = ?): the query planner cannot satisfy this with a single index seek.Affected:
getSpansByConversationId,getTraceIds, thesessionsview (used bygetSessionsandgetSessionsSince), and both cleanup subqueries in_cleanupOnStartup.Solution
Add
session_id TEXT GENERATED ALWAYS AS (COALESCE(conversation_id, chat_session_id)) STOREDtospans. SQLite computes and persists this at write time. No changes to INSERT paths or call sites.Add
idx_spans_session ON spans(session_id, start_time_ms)— composite index covering the equality lookup and theORDER BY start_time_msused ingetSpansByConversationId.Changes
getSpansByConversationIdWHERE conversation_id = ? OR chat_session_id = ?WHERE session_id = ?getTraceIdsWHERE conversation_id = ? OR chat_session_id = ?WHERE session_id = ?sessionsviewGROUP BY COALESCE(conversation_id, chat_session_id)GROUP BY session_id_cleanupOnStartupcutoffGROUP BY COALESCE(...)GROUP BY session_id_cleanupOnStartupDELETENOT INsubqueryNOT EXISTS— NULL-safe by structureSCHEMA_VERSION12Migration
v1 → v2 uses the standard SQLite table rebuild pattern: rename existing table → recreate with generated column →
INSERT INTO ... SELECTwith explicit 24-column list (session_idexcluded, it isGENERATED ALWAYS) → drop old table → recreate indexes → drop and recreatesessionsview. Required because SQLite does not supportALTER TABLE ADD COLUMNfor generated columns.Verification
Migration verified: row count preserved,
session_idcorrectly resolves toconversation_idwhen present,chat_session_idas fallback,nullwhen both are absent.sessionsview aggregates correct post-migration.