Summary
Introduce a dedicated DB performance test project (new .csproj, similar structure to integration tests) and CI workflow job to benchmark query performance against seeded high-cardinality data.
This provides repeatable baseline + re-measure loops for index strategy decisions across PostgreSQL and SQLite.
Motivation
We need measurable evidence in CI to:
- capture baseline timings,
- validate index changes improve target endpoints,
- prevent accidental regressions.
Proposal
1) New project
Create a new project alongside integration tests:
src/Agoda.DevExTelemetry.DbPerfTests/Agoda.DevExTelemetry.DbPerfTests.csproj
Project intent:
- host the app using WebApplicationFactory<Program>,
- seed a large deterministic + randomized dataset,
- wait for seed completion signal,
- call selected API endpoints,
- record timing distributions and compare to baseline.
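A minimal sketch of the harness shape, assuming the API exposes its `Program` class to the test assembly (e.g. via `public partial class Program {}`); `PerfFactory` and `EndpointTimer` are illustrative names, not existing types:

```csharp
// Sketch: in-process perf harness hosting the app via WebApplicationFactory
// and capturing raw latency samples for later percentile aggregation.
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;

public class PerfFactory : WebApplicationFactory<Program>
{
    // Point the host at the perf database and register the seeding service
    // here via builder.ConfigureServices(...) as needed.
}

public static class EndpointTimer
{
    // Calls one endpoint N times and returns raw latency samples (ms).
    public static async Task<double[]> SampleAsync(HttpClient client, string path, int samples)
    {
        var results = new double[samples];
        for (var i = 0; i < samples; i++)
        {
            var sw = Stopwatch.StartNew();
            (await client.GetAsync(path)).EnsureSuccessStatusCode();
            results[i] = sw.Elapsed.TotalMilliseconds;
        }
        return results;
    }
}
```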
2) Data seeding approach (DI-driven)
Inject a startup seeding service via test-host DI that:
- writes dataset through repository layer,
- combines deterministic + randomized data,
- creates realistic cardinality across:
  - project/repository/branch/platform,
  - local vs CI execution environments,
  - test runners,
  - date windows,
  - build categories + reload types,
- uses fixed random seed for reproducibility.
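A sketch of the reproducibility idea: a fixed `Random` seed plus deterministic identifier pools, so every run produces the same dataset at the target cardinality (entity shape and constant values below are illustrative):

```csharp
// Sketch: deterministic + randomized seed data, reproducible run-to-run.
using System;
using System.Collections.Generic;

public static class PerfSeedData
{
    private const int FixedSeed = 20240101; // fixed seed => identical data every run

    public static IEnumerable<(string Project, string Repo, string Branch, DateTime Timestamp)>
        BuildMetricsRows(int count)
    {
        var rng = new Random(FixedSeed);
        for (var i = 0; i < count; i++)
        {
            // Identifier pools pin the cardinality targets (~200 projects,
            // ~120 repos, ~400 branches); the RNG only picks combinations.
            yield return (
                Project: $"project-{rng.Next(200):D3}",
                Repo: $"repo-{rng.Next(120):D3}",
                Branch: $"branch-{rng.Next(400):D3}",
                Timestamp: new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc)
                    .AddMinutes(rng.Next(0, 60 * 24 * 90))); // 90-day date window
        }
    }
}
```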
3) Readiness gate
Do not rely on the app startup health check alone.
Use two gates:
- app startup health check,
- explicit seed completion signal (test-only seed-status endpoint or in-memory completion marker exposed to test harness).
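The two gates can be combined into a single wait helper; `/health` and `/test/seed-status` are assumed paths here, with the seed-status endpoint registered only in the test host:

```csharp
// Sketch: two-gate readiness wait before any benchmark traffic is sent.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class ReadinessGate
{
    public static async Task WaitUntilReadyAsync(HttpClient client, TimeSpan timeout)
    {
        var deadline = DateTime.UtcNow + timeout;

        // Gate 1: app startup health check.
        (await client.GetAsync("/health")).EnsureSuccessStatusCode();

        // Gate 2: explicit seed completion signal.
        while (!(await client.GetAsync("/test/seed-status")).IsSuccessStatusCode)
        {
            if (DateTime.UtcNow > deadline)
                throw new TimeoutException("Seeding did not complete before the deadline.");
            await Task.Delay(500);
        }
    }
}
```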
4) Endpoint benchmark set (initial)
- GET /api/dashboard/build-metrics
- GET /api/dashboard/test-runs
- GET /api/dashboard/api-build-summary
- GET /api/dashboard/clientside-build-summary
- GET /api/dashboard/test-run-summary
Run each with filter profiles matching real query shapes (date ranges + project/repo/branch/platform/env).
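Filter profiles could be declared as named query-string shapes; parameter names below are illustrative and must be aligned with the actual dashboard query contract:

```csharp
// Sketch: named filter profiles mirroring real query shapes.
public sealed record FilterProfile(string Name, string QueryString);

public static class FilterProfiles
{
    public static readonly FilterProfile[] All =
    {
        new("wide-date-range",  "?from=2024-01-01&to=2024-03-31"),
        new("single-project",   "?from=2024-03-01&to=2024-03-31&project=project-001"),
        new("repo-branch",      "?from=2024-03-01&to=2024-03-31&repo=repo-001&branch=branch-001"),
        new("ci-platform",      "?from=2024-03-01&to=2024-03-31&environment=ci&platform=linux"),
    };
}
```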
5) Engine matrix
Run tests for both engines:
- PostgreSQL
- SQLite
This is required because index behavior and query-planner behavior diverge by engine.
Seed volume contract (initial targets)
- BuildMetrics: ~100k rows
- TestRuns: ~50k rows
- TestCases: ~500k rows (avg ~10/test run)
- RawPayloads: ~50k rows
Cardinality targets:
- ~200 projects
- ~120 repositories
- ~400 branches
- representative platform/environment/test-runner spread
Threshold/regression strategy
Phase 1: stabilization (non-blocking)
- Run perf workflow nightly / on-demand.
- Publish artifacts only.
Phase 2: regression gating
- Use relative regression checks vs baseline (e.g., p95 must not regress >20%).
- Avoid hardcoded absolute ms thresholds on shared runners.
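The Phase 2 relative check reduces to a single comparison; the 20% threshold matches the example above and should be tuned per endpoint once variance is known:

```csharp
// Sketch: relative p95 regression check against a stored baseline (Phase 2).
public static class RegressionCheck
{
    // True when the candidate p95 exceeds the baseline by more than the
    // allowed relative increase (0.20 = 20%).
    public static bool IsRegression(double baselineP95Ms, double candidateP95Ms,
        double maxRelativeIncrease = 0.20)
        => candidateP95Ms > baselineP95Ms * (1 + maxRelativeIncrease);
}
```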
CI placement
Start as a dedicated workflow triggered by:
- workflow_dispatch (on-demand)
- nightly schedule
After variance/runtime analysis, promote selected checks to PR gating if stable and practical.
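One possible shape for the dedicated workflow; the workflow name, artifact paths, and the `PERF_DB_ENGINE` variable are illustrative, not existing conventions:

```yaml
# Sketch: dedicated perf workflow with manual + nightly triggers and an engine matrix.
name: db-perf
on:
  workflow_dispatch:
  schedule:
    - cron: "0 2 * * *"   # nightly
jobs:
  perf:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        engine: [postgres, sqlite]
    steps:
      - uses: actions/checkout@v4
      - name: Run perf tests
        run: dotnet test src/Agoda.DevExTelemetry.DbPerfTests --configuration Release
        env:
          PERF_DB_ENGINE: ${{ matrix.engine }}
      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: perf-results-${{ matrix.engine }}
          path: perf-results/
```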
Artifact format and consumption
Primary artifact: versioned JSON schema for machine comparison.
Minimum fields:
- endpoint
- filter profile
- engine
- sample count
- p50 / p95 / p99
- mean / stddev
- baseline delta (%)
Generate Markdown summary from JSON for human review in workflow output/PR comments.
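A possible instance of the artifact covering the minimum fields; property names are illustrative and the numbers are placeholders showing shape only:

```json
{
  "schemaVersion": 1,
  "results": [
    {
      "endpoint": "/api/dashboard/build-metrics",
      "filterProfile": "wide-date-range",
      "engine": "postgres",
      "sampleCount": 200,
      "p50Ms": 14.2,
      "p95Ms": 41.7,
      "p99Ms": 88.0,
      "meanMs": 18.9,
      "stddevMs": 12.4,
      "baselineDeltaPct": -6.3
    }
  ]
}
```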
Suggested rollout plan
- Capture baseline query timings for key endpoints.
- Add PostgreSQL composites first (and optional partial indexes if volume justifies).
- Add minimal SQLite composites.
- Re-measure and keep only indexes with measurable benefit.
Acceptance criteria (initial)
- New perf test .csproj committed and runnable locally.
- Deterministic+randomized seed generator implemented with target volume/cardinality.
- Seed completion gating implemented (separate from app startup health).
- Dedicated CI perf workflow runs SQLite + PostgreSQL matrix and uploads JSON + Markdown artifacts.
- At least one relative regression assertion per critical endpoint (after stabilization phase).
- Docs added for local execution and artifact interpretation.
Additional implementation notes (important)
A) Seeding performance strategy
Given target volume (~700k+ rows total), do not rely on naive row-by-row EF inserts.
- PostgreSQL: use bulk-friendly ingestion strategy (batched insert or COPY-style path where feasible).
- SQLite: prefer high-throughput batched inserts and consider prebuilt seeded DB artifact option for local fast loop.
- Keep repository-layer shape for realism where practical, but optimize seed path explicitly to avoid CI timeout risk.
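For the PostgreSQL path, a COPY-style sketch using Npgsql's binary import; table and column names are illustrative, and the SQLite path would instead batch plain INSERTs inside one transaction:

```csharp
// Sketch: COPY-based bulk seed path for PostgreSQL via Npgsql binary import.
using System;
using System.Collections.Generic;
using Npgsql;
using NpgsqlTypes;

public static class BulkSeeder
{
    public static void CopyBuildMetrics(NpgsqlConnection conn,
        IEnumerable<(string Project, string Repo, DateTime Timestamp)> rows)
    {
        using var writer = conn.BeginBinaryImport(
            "COPY build_metrics (project, repo, created_at) FROM STDIN (FORMAT BINARY)");
        foreach (var row in rows)
        {
            writer.StartRow();
            writer.Write(row.Project, NpgsqlDbType.Text);
            writer.Write(row.Repo, NpgsqlDbType.Text);
            writer.Write(row.Timestamp, NpgsqlDbType.TimestampTz); // timestamps assumed UTC
        }
        writer.Complete(); // commits the import; disposal without Complete rolls back
    }
}
```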
B) Warm-up phase before measurement
Before capturing benchmark timings:
- run warm-up calls for representative endpoint mix (time- or iteration-based),
- ensure app startup/JIT/connection pool/ORM model initialization effects are not included in measured distribution.
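An iteration-based warm-up can be as simple as the loop below; all warm-up timings are intentionally discarded:

```csharp
// Sketch: warm-up pass exercising JIT, connection pool, and ORM model
// initialization before any measured samples are taken.
using System.Net.Http;
using System.Threading.Tasks;

public static class WarmUp
{
    public static async Task RunAsync(HttpClient client, string[] paths,
        int iterationsPerPath = 20)
    {
        foreach (var path in paths)
            for (var i = 0; i < iterationsPerPath; i++)
                (await client.GetAsync(path)).EnsureSuccessStatusCode(); // result not recorded
    }
}
```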
C) Benchmark execution tooling
Do not reinvent stats collection.
Preferred options:
- .NET-native harness with robust distribution capture,
- or dedicated load test tool (e.g., NBomber/k6) integrated into workflow.
Whichever path is chosen must output the agreed JSON schema (p50/p95/p99/mean/stddev/delta).
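If the chosen harness does not emit these statistics directly, a nearest-rank aggregation over the raw samples is one way to produce them (a sketch, not a prescribed implementation):

```csharp
// Sketch: nearest-rank percentile aggregation matching the JSON schema fields.
using System;
using System.Linq;

public static class LatencyStats
{
    public static (double P50, double P95, double P99, double Mean, double StdDev)
        Summarize(double[] samplesMs)
    {
        var sorted = samplesMs.OrderBy(x => x).ToArray();
        double Pct(double p) =>
            sorted[Math.Min(sorted.Length - 1, (int)Math.Ceiling(p * sorted.Length) - 1)];
        var mean = sorted.Average();
        var stddev = Math.Sqrt(sorted.Sum(x => (x - mean) * (x - mean)) / sorted.Length);
        return (Pct(0.50), Pct(0.95), Pct(0.99), mean, stddev);
    }
}
```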
D) Future PR gating reliability
When moving from nightly runs to PR gating, use a same-job A/B comparison where possible:
- run baseline (main),
- run candidate (PR),
- compare deltas on same runner.
This reduces false positives from noisy shared CI hardware.
E) Index keep/remove decision criteria
Index retention decisions must weigh both sides:
- read-time improvement on target endpoints,
- write-time/ingest penalty overhead from maintaining additional indexes.
Do not keep indexes that improve read latency marginally while significantly degrading ingest throughput.