Proposal: add dedicated DB performance test project + CI benchmark job #16

@joeldickson

Summary

Introduce a dedicated DB performance test project (new .csproj, similar structure to integration tests) and CI workflow job to benchmark query performance against seeded high-cardinality data.

This provides a repeatable baseline-and-re-measure loop to support index strategy decisions across PostgreSQL and SQLite.


Motivation

We need measurable evidence in CI to:

  • capture baseline timings,
  • validate index changes improve target endpoints,
  • prevent accidental regressions.

Proposal

1) New project

Create a new project alongside integration tests:

  • src/Agoda.DevExTelemetry.DbPerfTests/Agoda.DevExTelemetry.DbPerfTests.csproj

Project intent:

  • host app using WebApplicationFactory<Program>,
  • seed large deterministic + randomized dataset,
  • wait for seed completion signal,
  • call selected API endpoints,
  • record timing distributions and compare to baseline.
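As a minimal sketch, the test host wiring could look like the following (class and member names are illustrative, and it assumes the API project exposes a public `Program` class, as `WebApplicationFactory<Program>` requires):

```csharp
// Illustrative sketch only: boots the real app in-process for perf runs.
using System;
using System.Net.Http;
using Microsoft.AspNetCore.Mvc.Testing;

public sealed class DbPerfTestFixture : IDisposable
{
    private readonly WebApplicationFactory<Program> _factory;

    public HttpClient Client { get; }

    public DbPerfTestFixture()
    {
        // Seeding services get registered via WithWebHostBuilder overrides.
        _factory = new WebApplicationFactory<Program>();
        Client = _factory.CreateClient();
    }

    public void Dispose()
    {
        Client.Dispose();
        _factory.Dispose();
    }
}
```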

2) Data seeding approach (DI-driven)

Inject startup seeding service via test host DI that:

  • writes dataset through repository layer,
  • combines deterministic + randomized data,
  • creates realistic cardinality across:
    • project/repository/branch/platform,
    • local vs CI execution environment,
    • test runners,
    • date windows,
    • build categories + reload types,
  • uses fixed random seed for reproducibility.
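A sketch of what such a seeding service could look like — `IBuildMetricsRepository` and `BuildMetric` are hypothetical stand-ins for the real repository layer and entity, and the counts mirror the cardinality targets later in this proposal:

```csharp
// Illustrative sketch: repository interface and entity shape are assumptions.
public sealed class PerfDataSeeder
{
    private const int RandomSeed = 1337; // fixed seed => reproducible "random" data
    private static readonly string[] Platforms = { "windows", "linux", "macos" };

    private readonly IBuildMetricsRepository _repository;

    public PerfDataSeeder(IBuildMetricsRepository repository) => _repository = repository;

    public async Task SeedBuildMetricsAsync(CancellationToken ct)
    {
        var rng = new Random(RandomSeed);
        for (var i = 0; i < 100_000; i++)                  // ~100k BuildMetrics rows
        {
            await _repository.AddAsync(new BuildMetric
            {
                Project    = $"project-{rng.Next(200)}",   // ~200 projects
                Repository = $"repo-{rng.Next(120)}",      // ~120 repositories
                Branch     = $"branch-{rng.Next(400)}",    // ~400 branches
                Platform   = Platforms[rng.Next(Platforms.Length)],
                IsCi       = rng.Next(2) == 0,             // local vs CI split
                StartedAt  = DateTime.UtcNow.AddDays(-rng.Next(90)), // date window
            }, ct);
        }
    }
}
```

Note the per-row loop here is only for illustrating cardinality; the real seed path should use the batched strategy described in implementation note A.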

3) Readiness gate

Do not rely on the app startup health check alone: the app can report healthy before seeding has finished.

Use two gates:

  1. app startup health check,
  2. explicit seed completion signal (test-only seed-status endpoint or in-memory completion marker exposed to test harness).
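From the test harness side, the two gates could be sketched like this (`/perf/seed-status` is a hypothetical test-only endpoint name, and the health route is an assumption):

```csharp
// Illustrative sketch of the two readiness gates, polled from the test harness.
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class ReadinessGate
{
    public static async Task WaitUntilReadyAsync(HttpClient client, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource(timeout);

        // Gate 1: app startup health check.
        while (true)
        {
            using var health = await client.GetAsync("/health", cts.Token);
            if (health.StatusCode == HttpStatusCode.OK) break;
            await Task.Delay(500, cts.Token);
        }

        // Gate 2: explicit seed completion signal.
        while (await client.GetStringAsync("/perf/seed-status", cts.Token) != "complete")
            await Task.Delay(500, cts.Token);
    }
}
```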

4) Endpoint benchmark set (initial)

  • GET /api/dashboard/build-metrics
  • GET /api/dashboard/test-runs
  • GET /api/dashboard/api-build-summary
  • GET /api/dashboard/clientside-build-summary
  • GET /api/dashboard/test-run-summary

Run each with filter profiles matching real query shapes (date ranges + project/repo/branch/platform/env).
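A filter profile could be modeled as a small record that renders the query string — parameter names here are assumptions about the API's actual query shape:

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: query parameter names are assumptions, not the real API.
public sealed record FilterProfile(
    DateOnly From,
    DateOnly To,
    string? Project = null,
    string? Repo = null,
    string? Branch = null,
    string? Platform = null,
    string? Env = null)
{
    public string ToQueryString()
    {
        var parts = new List<string> { $"from={From:yyyy-MM-dd}", $"to={To:yyyy-MM-dd}" };
        if (Project  is not null) parts.Add($"project={Uri.EscapeDataString(Project)}");
        if (Repo     is not null) parts.Add($"repo={Uri.EscapeDataString(Repo)}");
        if (Branch   is not null) parts.Add($"branch={Uri.EscapeDataString(Branch)}");
        if (Platform is not null) parts.Add($"platform={Uri.EscapeDataString(Platform)}");
        if (Env      is not null) parts.Add($"env={Uri.EscapeDataString(Env)}");
        return "?" + string.Join("&", parts);
    }
}
```

Each endpoint would then be benchmarked once per profile, keeping profiles named and stable so results stay comparable across runs.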

5) Engine matrix

Run tests for both:

  • SQLite
  • PostgreSQL

This is required because index selection and query planner behavior diverge between engines.


Seed volume contract (initial targets)

  • BuildMetrics: ~100k rows
  • TestRuns: ~50k rows
  • TestCases: ~500k rows (avg ~10/test run)
  • RawPayloads: ~50k rows

Cardinality targets:

  • ~200 projects
  • ~120 repositories
  • ~400 branches
  • representative platform/environment/test-runner spread

Threshold/regression strategy

Phase 1: stabilization (non-blocking)

  • Run perf workflow nightly / on-demand.
  • Publish artifacts only.

Phase 2: regression gating

  • Use relative regression checks vs baseline (e.g., p95 must not regress >20%).
  • Avoid hardcoded absolute ms thresholds on shared runners.
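The phase 2 relative check reduces to a small comparison; a sketch, using the example 20% threshold from this proposal:

```csharp
using System;

// Sketch of the relative regression check; 20% matches the example threshold above.
public static class RegressionCheck
{
    // True when the candidate p95 regressed more than maxRegressionPct
    // relative to the baseline p95.
    public static bool IsRegression(double baselineP95Ms, double candidateP95Ms,
                                    double maxRegressionPct = 20.0)
    {
        var deltaPct = (candidateP95Ms - baselineP95Ms) / baselineP95Ms * 100.0;
        return deltaPct > maxRegressionPct;
    }
}
```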

CI placement

Start as dedicated workflow:

  • workflow_dispatch
  • nightly schedule

After variance/runtime analysis, promote selected checks to PR gating if stable and practical.
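Sketched as a GitHub Actions workflow, combining both triggers with the engine matrix from section 5 (the `DB_ENGINE` variable and the test project path are assumptions, and job steps are abbreviated):

```yaml
# Sketch only: nightly + manual triggers, engine matrix; real steps omitted.
name: db-perf
on:
  workflow_dispatch:
  schedule:
    - cron: "0 2 * * *"   # nightly; exact time is a placeholder
jobs:
  perf:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        engine: [sqlite, postgresql]
    steps:
      - uses: actions/checkout@v4
      - run: dotnet test src/Agoda.DevExTelemetry.DbPerfTests
        env:
          DB_ENGINE: ${{ matrix.engine }}   # assumed engine-selection variable
```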


Artifact format and consumption

Primary artifact: versioned JSON schema for machine comparison.

Minimum fields:

  • endpoint
  • filter profile
  • engine
  • sample count
  • p50 / p95 / p99
  • mean / stddev
  • baseline delta (%)

Generate Markdown summary from JSON for human review in workflow output/PR comments.
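For illustration, one record in the JSON artifact might look like this — field names and all numeric values are placeholders, not measured data:

```json
{
  "schemaVersion": 1,
  "engine": "postgresql",
  "results": [
    {
      "endpoint": "/api/dashboard/build-metrics",
      "filterProfile": "90d-all-projects",
      "sampleCount": 200,
      "p50Ms": 41.2,
      "p95Ms": 96.7,
      "p99Ms": 140.3,
      "meanMs": 48.9,
      "stddevMs": 18.4,
      "baselineDeltaPct": -12.5
    }
  ]
}
```

Versioning the schema (`schemaVersion`) lets the comparison tooling evolve without breaking older baselines.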


Suggested rollout plan

  1. Capture baseline query timings for key endpoints.
  2. Add PostgreSQL composites first (and optional partial indexes if volume justifies).
  3. Add minimal SQLite composites.
  4. Re-measure and keep only indexes with measurable benefit.

Acceptance criteria (initial)

  • New perf test .csproj committed and runnable locally.
  • Deterministic+randomized seed generator implemented with target volume/cardinality.
  • Seed completion gating implemented (separate from app startup health).
  • Dedicated CI perf workflow runs SQLite + PostgreSQL matrix and uploads JSON + Markdown artifacts.
  • At least one relative regression assertion per critical endpoint (after stabilization phase).
  • Docs added for local execution and artifact interpretation.

Additional implementation notes (important)

A) Seeding performance strategy

Given target volume (~700k+ rows total), do not rely on naive row-by-row EF inserts.

  • PostgreSQL: use bulk-friendly ingestion strategy (batched insert or COPY-style path where feasible).
  • SQLite: prefer high-throughput batched inserts and consider prebuilt seeded DB artifact option for local fast loop.
  • Keep repository-layer shape for realism where practical, but optimize seed path explicitly to avoid CI timeout risk.
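A batched EF Core variant could be sketched as follows (`BuildMetric` is a hypothetical entity name and the batch size is a starting guess to tune per engine):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Sketch of a batched insert seed path for the ~700k-row seed.
public static class BulkSeed
{
    public static async Task SeedAsync(DbContext db, IEnumerable<BuildMetric> rows,
                                       int batchSize = 5_000, CancellationToken ct = default)
    {
        db.ChangeTracker.AutoDetectChangesEnabled = false; // avoid repeated change scans
        var batch = new List<BuildMetric>(batchSize);
        foreach (var row in rows)
        {
            batch.Add(row);
            if (batch.Count == batchSize)
            {
                db.AddRange(batch);
                await db.SaveChangesAsync(ct);
                db.ChangeTracker.Clear(); // keep tracked-entity memory flat
                batch.Clear();
            }
        }
        if (batch.Count > 0)
        {
            db.AddRange(batch);
            await db.SaveChangesAsync(ct);
        }
    }
}
```

For PostgreSQL, Npgsql's binary COPY (`NpgsqlConnection.BeginBinaryImport`) is the COPY-style path referred to above; for SQLite, wrapping batches in a single transaction gives a similar throughput win.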

B) Warm-up phase before measurement

Before capturing benchmark timings:

  • run warm-up calls for representative endpoint mix (time- or iteration-based),
  • ensure app startup/JIT/connection pool/ORM model initialization effects are not included in measured distribution.
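A time-based warm-up pass could be as simple as the following sketch (responses are discarded, nothing is recorded):

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

// Sketch of a warm-up loop over the representative endpoint mix.
public static class WarmUp
{
    public static async Task RunAsync(HttpClient client, IReadOnlyList<string> urls,
                                      TimeSpan duration)
    {
        var stopAt = DateTime.UtcNow + duration;
        while (DateTime.UtcNow < stopAt)
            foreach (var url in urls)
                (await client.GetAsync(url)).Dispose(); // response intentionally ignored
    }
}
```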

C) Benchmark execution tooling

Do not reinvent stats collection.

Preferred options:

  • .NET-native harness with robust distribution capture,
  • or dedicated load test tool (e.g., NBomber/k6) integrated into workflow.

Whichever path is chosen must output the agreed JSON schema (p50/p95/p99/mean/stddev/delta).
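If a custom harness is chosen over an off-the-shelf tool, the percentile capture itself is small; a nearest-rank sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of nearest-rank percentile computation for the p50/p95/p99 fields.
public static class LatencyStats
{
    public static double Percentile(IReadOnlyList<double> samplesMs, double p)
    {
        if (samplesMs.Count == 0) throw new ArgumentException("no samples");
        var sorted = samplesMs.OrderBy(x => x).ToArray();
        var rank = (int)Math.Ceiling(p / 100.0 * sorted.Length); // nearest-rank method
        return sorted[Math.Clamp(rank - 1, 0, sorted.Length - 1)];
    }
}
```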

D) Future PR gating reliability

When moving from nightly to PR-gating, use same-job A/B comparison where possible:

  1. run baseline (main),
  2. run candidate (PR),
  3. compare deltas on same runner.

This reduces false positives from noisy shared CI hardware.

E) Index keep/remove decision criteria

Index retention decisions must weigh both sides of the trade-off:

  • read-time improvement on target endpoints,
  • write-time/ingest penalty overhead from maintaining additional indexes.

Do not keep indexes that improve read latency marginally while significantly degrading ingest throughput.
