feat(bench): add release performance comparison workflow#143
Conversation
- Extend vs_linalg with LDLT/Cholesky benchmark rows and shared deterministic inputs. - Add smoke coverage that checks la-stack, nalgebra, and faer agree on benchmark inputs. - Expand bench-compare to support latest-vs-last reports, suite/scope selection, peer baseline context, and clearer malformed Criterion diagnostics. - Document the benchmark methodology, release baseline workflow, roadmap direction, and contributor guidance.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds deterministic benchmark helpers and bench-gating tests, refactors vs_linalg benches toward SPD (LDLT/Cholesky) flows, and extends scripts/bench_compare to be suite/scope-aware with peer-context reporting; updates just recipes and benchmarking/release/contributing docs. ChangesBenchmark Infrastructure and Suite Comparison
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #143 +/- ##
=======================================
Coverage 99.61% 99.61%
=======================================
Files 5 5
Lines 2846 2846
=======================================
Hits 2835 2835
Misses 11 11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Use a literal regex pattern for the malformed Criterion JSON diagnostic so Windows paths with backslashes do not break pytest's match expression.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/bench_compare.py`:
- Around line 721-726: The hint generator _save_baseline_hint currently drops
the user's requested scope; modify _save_baseline_hint to accept a scope
argument (e.g., def _save_baseline_hint(suite: str, baseline: str, scope: str)
-> str) and include the scope flag in the returned command (append `--scope
{scope}` or only when scope is not the default) for all branches (the "exact",
"vs_linalg" and default cases) so the suggested `bench-save-baseline` command
preserves the original scope; apply the same change to the other similar helper
at the other occurrence mentioned (around the 749-751 block).
In `@scripts/tests/test_bench_compare.py`:
- Around line 153-158: The regex used in the pytest.raises matcher in
test_read_estimate_malformed_json_names_file is vulnerable because the Path
string {est} may contain regex metacharacters; update the test to escape the
path when building the match (e.g. use re.escape(str(est)) or escape only the
{est} portion) so the match is literal, and add the necessary import for re;
keep the assertion calling bench_compare._read_estimate unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5c930f6f-b3b5-407c-8c47-3e7e31324f1b
📒 Files selected for processing (12)
CONTRIBUTING.mdREADME.mdbenches/common/vs_linalg.rsbenches/vs_linalg.rsdocs/BENCHMARKING.mddocs/RELEASING.mddocs/roadmap.mdjustfilescripts/README.mdscripts/bench_compare.pyscripts/tests/test_bench_compare.pytests/vs_linalg_inputs.rs
| def _save_baseline_hint(suite: str, baseline: str) -> str: | ||
| if suite == "exact": | ||
| return f"just bench-save-baseline {baseline} exact" | ||
| if suite == "vs_linalg": | ||
| return f"just bench-save-baseline {baseline} vs_linalg" | ||
| return f"just bench-save-baseline {baseline}" |
There was a problem hiding this comment.
Preserve scope in the baseline recovery hint.
When the user runs with --scope all-benches, this fallback still suggests the default bench-save-baseline flow. That drops the requested scope and can leave the peer baseline rows missing on the rerun.
Also applies to: 749-751
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/bench_compare.py` around lines 721 - 726, The hint generator
_save_baseline_hint currently drops the user's requested scope; modify
_save_baseline_hint to accept a scope argument (e.g., def
_save_baseline_hint(suite: str, baseline: str, scope: str) -> str) and include
the scope flag in the returned command (append `--scope {scope}` or only when
scope is not the default) for all branches (the "exact", "vs_linalg" and default
cases) so the suggested `bench-save-baseline` command preserves the original
scope; apply the same change to the other similar helper at the other occurrence
mentioned (around the 749-751 block).
Summary by CodeRabbit
Documentation
Tests
Chores / Tooling