Merged
2 changes: 1 addition & 1 deletion .github/workflows/benchmarks.yml
@@ -1,6 +1,6 @@
name: Benchmarks

# Detect performance regressions in exact-arithmetic benchmarks.
# Detect performance regressions.
# - Push to main: run benchmarks, save Criterion baseline, upload as artifact.
# - PRs: find latest main baseline artifact, download, compare, report.
# Regressions are warning-only — this workflow never fails on regression.
87 changes: 87 additions & 0 deletions .github/workflows/release-benchmarks.yml
@@ -0,0 +1,87 @@
name: Release Benchmarks

# Archive full Criterion benchmark baselines for published releases.
# Release jobs publish durable artifacts, so they intentionally do not restore
# or save dependency caches.

permissions:
  contents: write

on:
  release:
    types:
      - published

concurrency:
  group: release-benchmarks-${{ github.event.release.tag_name }}
  cancel-in-progress: false

env:
  CARGO_TERM_COLOR: always
  RUST_BACKTRACE: 1

jobs:
  release-baseline:
    runs-on: ubuntu-latest
    timeout-minutes: 60

    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          ref: ${{ github.event.release.tag_name }}
          persist-credentials: false

      - name: Install Rust toolchain
        uses: actions-rust-lang/setup-rust-toolchain@46268bd060767258de96ed93c1251119784f2ab6 # v1.16.1
        with:
          cache: false

      - name: Save release Criterion baseline
        env:
          RELEASE_TAG: ${{ github.event.release.tag_name }}
        run: |
          set -euo pipefail

          cargo bench --features bench --bench vs_linalg -- --save-baseline "$RELEASE_TAG"
          cargo bench --features bench,exact --bench exact -- --save-baseline "$RELEASE_TAG"

      - name: Package release Criterion baseline
        id: package-baseline
        env:
          RELEASE_TAG: ${{ github.event.release.tag_name }}
        run: |
          set -euo pipefail

          asset="la-stack-${RELEASE_TAG}-criterion-baseline.tar.gz"
          tar -C target -czf "$asset" criterion
          echo "asset=$asset" >> "$GITHUB_OUTPUT"

      - name: Upload temporary baseline artifact
        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
          name: bench-baseline-${{ github.event.release.tag_name }}
          path: ${{ steps.package-baseline.outputs.asset }}
          retention-days: 30
          if-no-files-found: error

      - name: Attach baseline to GitHub Release
        env:
          GH_TOKEN: ${{ github.token }}
          RELEASE_TAG: ${{ github.event.release.tag_name }}
          RELEASE_ASSET: ${{ steps.package-baseline.outputs.asset }}
        run: |
          set -euo pipefail

          gh release upload "$RELEASE_TAG" "$RELEASE_ASSET" --clobber

      - name: Release baseline summary
        env:
          RELEASE_ASSET: ${{ steps.package-baseline.outputs.asset }}
        run: |
          set -euo pipefail

          {
            echo "### Release Benchmark Baseline"
            echo ""
            echo "Uploaded release asset: \`$RELEASE_ASSET\`"
          } >> "$GITHUB_STEP_SUMMARY"
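
Not part of this PR: a minimal local sketch of the "Package release Criterion baseline" step above, useful for checking the archive layout without running the workflow. The function name `package_baseline`, the tag `v0.0.0-test`, and the fake `estimates.json` file are all hypothetical; only the asset naming pattern and the `tar` invocation are taken from the workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the PR): package target/criterion the same way
# the workflow step does, then print the asset file name.
package_baseline() {
  local release_tag="$1"
  local asset="la-stack-${release_tag}-criterion-baseline.tar.gz"
  # -C target makes archive paths start with "criterion/", so the archive
  # can later be unpacked with `tar -C target -xzf`.
  tar -C target -czf "$asset" criterion
  echo "$asset"
}

# Demo in a scratch directory with a fake baseline file (no cargo needed).
workdir="$(mktemp -d)"
cd "$workdir"
mkdir -p target/criterion/example/v0.0.0-test
echo '{}' > target/criterion/example/v0.0.0-test/estimates.json

asset="$(package_baseline v0.0.0-test)"
listing="$(tar -tzf "$asset")"
echo "$listing"
```

Because the archive is rooted at `criterion/` rather than `target/criterion/`, the restore command has to extract into `target`, which matches the `tar -C target -xzf` step documented in `docs/BENCHMARKING.md`.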
29 changes: 27 additions & 2 deletions docs/BENCHMARKING.md
@@ -116,7 +116,9 @@ just bench-latest-vs-last
## Comparing performance across releases

Criterion baselines are saved into `target/criterion/` and persist across
`git checkout` but **not** across `cargo clean`.
`git checkout` but **not** across `cargo clean`. Published releases also attach
a compressed Criterion baseline to the GitHub Release so historical release
baselines can be restored later.

### Latest vs last

@@ -158,6 +160,26 @@ just bench-compare v0.2.0

You can save multiple baselines and compare against any of them.

If the release baseline is already present in `target/criterion/`, skip the
checkout step and compare directly. For example, to compare current code against
the saved `v0.4.2` release baseline:

```bash
just bench-latest # gather latest la-stack measurements
just bench-compare v0.4.2 # compare latest measurements against v0.4.2
```

If the release baseline is not present locally, download and restore the release
asset first:

```bash
gh release download v0.4.2 --pattern "la-stack-v0.4.2-criterion-baseline.tar.gz" # fetch archived release baseline
mkdir -p target # ensure Criterion parent directory exists
tar -C target -xzf la-stack-v0.4.2-criterion-baseline.tar.gz # restore target/criterion baseline data
just bench-latest # gather latest la-stack measurements
just bench-compare v0.4.2 # compare latest measurements against v0.4.2
```

### Output

`just bench-compare` writes `target/bench-reports/performance.md` by
@@ -187,11 +209,14 @@ See `scripts/criterion_dim_plot.py --help` for options.

## Release workflow

At release time, save a baseline so future work can compare against it:
At release time, save a local baseline so future work can compare against it:

```bash
just bench-save-baseline $TAG
just bench-save-last
```

When the GitHub Release is published, `.github/workflows/release-benchmarks.yml`
saves a full release baseline and attaches
`la-stack-$TAG-criterion-baseline.tar.gz` to the release as the durable archive.
See `docs/RELEASING.md` step 5 for where this fits in the release process.
16 changes: 12 additions & 4 deletions docs/RELEASING.md
@@ -111,10 +111,18 @@ just bench-save-baseline $TAG
just bench-save-last
```

These baselines can be compared against in future optimization work. The
default local report command, `just bench-compare`, compares latest
measurements against `last` and writes `target/bench-reports/performance.md`;
it does not update README benchmark tables or committed release artifacts.
These baselines can be compared against in future optimization work on the
release branch. The default local report command, `just bench-compare`, compares
latest measurements against `last` and writes
`target/bench-reports/performance.md`; it does not update README benchmark
tables or committed release artifacts.

After the GitHub Release is published, the `Release Benchmarks` workflow checks
out the release tag, saves a full Criterion baseline, and attaches
`la-stack-$TAG-criterion-baseline.tar.gz` to the release. That release asset is
the durable archive for historical baseline comparisons; the workflow also
uploads a short-lived Actions artifact for debugging the run.

See `docs/BENCHMARKING.md` for the full comparison workflow.

6. Validate the release branch
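
Not part of this PR: the restore steps added to `docs/BENCHMARKING.md` could be wrapped in a small helper along these lines. This is a sketch under assumptions: `restore_baseline` is a hypothetical name, the archive is assumed to be already downloaded into the current directory (for example with the `gh release download` command from the docs), and the demo builds a fake archive so it runs offline.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the repository): unpack an already-downloaded
# release baseline archive into <project_dir>/target/criterion.
restore_baseline() {
  local tag="$1" project_dir="${2:-.}"
  local asset="la-stack-${tag}-criterion-baseline.tar.gz"
  if [[ ! -f "$asset" ]]; then
    echo "missing $asset; fetch it first with: gh release download $tag --pattern $asset" >&2
    return 1
  fi
  mkdir -p "$project_dir/target"   # ensure Criterion parent directory exists
  tar -C "$project_dir/target" -xzf "$asset"
}

# Offline demo: build a fake archive rooted at criterion/, then restore it.
scratch="$(mktemp -d)"
cd "$scratch"
mkdir -p src/criterion/example
echo '{}' > src/criterion/example/estimates.json
tar -C src -czf la-stack-v0.0.0-test-criterion-baseline.tar.gz criterion

restore_baseline v0.0.0-test restored
ls restored/target/criterion/example
```

After a real restore, `just bench-latest` followed by `just bench-compare <tag>` would proceed as described in the docs; the helper only covers the unpacking step and fails early when the archive is missing.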