perf: replay cached dry-run diffs for unchanged files, 9-150x faster warm dry-runs #8033

Open
SanderMuller wants to merge 3 commits into rectorphp:main from SanderMuller:perf/dry-run-diff-cache

Conversation

SanderMuller commented Jun 11, 2026
Contributor

Replays cached dry-run diffs so warm dry-runs skip reprocessing unchanged files (9-150x). The replay is only sound if the cache invalidates when a dependency changes, so this also carries the dependency-aware capture it builds on — the correctness prerequisite, originally opened as #8028 and now closed in favour of landing it here. Review order: the dependency capture is the first commit, the replay is the last two.

Problem

Files with a pending diff are never marked clean in dry-run mode — correct, the diff must keep being reported — so every warm dry-run reprocesses them from scratch: parse, full PHPStan scope resolution, every rule. On real projects most warm time is exactly this. laravel/framework src/Illuminate with the prepared sets has 1,526 pending diffs: a warm dry-run costs the same as a cold one (220s vs 240s single process).

The cache was also dependency-blind: a clean file stayed skipped on a warm run even when a dependency changed (e.g. a parent gains : int, so a child can now infer its own return type). A diff cache keyed only on a file's own content would replay that stale diff, so the key has to see dependencies first.
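To make the stale case concrete, here is a hypothetical parent/child pair (class and method names are illustrative, not taken from the PR). Only the parent's file changes, yet the child is the file that should now get a new diff:

```php
<?php

declare(strict_types=1);

// AbstractCounter.php (hypothetical dependency).
// Before the change, count() had no declared return type.
abstract class AbstractCounter
{
    public function count(): int // ": int" is the only edit, made in this file
    {
        return 0;
    }
}

// CachedCounter.php (hypothetical dependent): its own content hash is unchanged,
// but a return-type rule can now see that total() always returns int.
final class CachedCounter extends AbstractCounter
{
    public function total()
    {
        return $this->count();
    }
}
```

A cache keyed only on the content of `CachedCounter.php` sees no difference between the two runs, which is why the key must also cover `AbstractCounter.php`.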

Change

  1. Dependency-aware capture. PHPStanNodeScopeResolver runs PHPStan's own DependencyResolver per node and records the surfaced files. Cache entries become {hash, deps:{file:hash}}, re-validated on load; legacy string entries self-upgrade; a failed capture means the file is never cached (no partial sets).

  2. Diff replay. Cache the produced FileDiff keyed on the file's own content hash, the parameter hash and one content hash per captured dependency. When everything still matches on the next run, replay the cached diff instead of reprocessing the file — skipping scope resolution entirely:

// ApplicationFileProcessor::processFile()
if ($useDiffCache) {
    $cachedFileProcessResult = $this->dryRunDiffCache->load($file, $configuration);
    if ($cachedFileProcessResult instanceof FileProcessResult) {
        return $cachedFileProcessResult;
    }
}
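What a load has to verify before replaying can be sketched as below. This is a minimal illustration of the `{hash, deps:{file:hash}}` entry shape under stated assumptions: `buildEntry()` and `isEntryFresh()` are hypothetical helpers, not the PR's actual methods, and SHA-256 is an arbitrary choice of hash.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: records the file's own hash plus one hash per dependency.
 * Assumes every listed file exists at capture time.
 *
 * @param string[] $dependencyFiles
 * @return array{hash: string, deps: array<string, string>}
 */
function buildEntry(string $file, array $dependencyFiles): array
{
    $deps = [];
    foreach ($dependencyFiles as $dependencyFile) {
        $deps[$dependencyFile] = hash_file('sha256', $dependencyFile);
    }

    return ['hash' => hash_file('sha256', $file), 'deps' => $deps];
}

/**
 * Hypothetical helper: an entry is fresh only if the file and every
 * recorded dependency still hash to the stored values.
 *
 * @param array{hash: string, deps: array<string, string>} $entry
 */
function isEntryFresh(string $file, array $entry): bool
{
    if (! is_file($file) || hash_file('sha256', $file) !== $entry['hash']) {
        return false;
    }

    foreach ($entry['deps'] as $dependencyFile => $expectedHash) {
        // A deleted or edited dependency invalidates the entry.
        if (! is_file($dependencyFile) || hash_file('sha256', $dependencyFile) !== $expectedHash) {
            return false;
        }
    }

    return true;
}
```

Re-hashing on load keeps the check independent of file modification times, at the cost of reading each dependency once per validation.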

Dry-run only: write mode always computes fresh. Selective runs (--only, --only-suffix) bypass the cache entirely. --no-diffs results never cross into normal entries. The original hasChanged flag is replayed, since a rule can report line changes while printing identical content. The parameter hash is memoized per process, as computing it serializes the whole parameter bag and the key needs it per file.
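The guards above could be expressed roughly as follows. Both functions are hypothetical and only illustrate the rules stated in this description; in particular, folding the diff mode into the key is one possible way to keep `--no-diffs` results apart, not necessarily how the PR does it.

```php
<?php

declare(strict_types=1);

/** Hypothetical guard: replay only in dry-run mode and never for selective runs. */
function shouldUseDiffCache(bool $isDryRun, ?string $onlyRule, ?string $onlySuffix): bool
{
    return $isDryRun && $onlyRule === null && $onlySuffix === null;
}

/**
 * Hypothetical key: a change to the file, the parameters, any dependency
 * or the diff mode yields a different key, so nothing stale is replayed.
 *
 * @param array<string, string> $dependencyHashes file path => content hash
 */
function diffCacheKey(
    string $fileHash,
    string $parameterHash,
    array $dependencyHashes,
    bool $showDiffs
): string {
    ksort($dependencyHashes); // capture order must not affect the key

    $payload = json_encode(
        [$fileHash, $parameterHash, $dependencyHashes, $showDiffs],
        JSON_THROW_ON_ERROR
    );

    return hash('sha256', $payload);
}
```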

Numbers

| corpus | pending diffs | warm dry-run on main | with replay | speedup |
| --- | --- | --- | --- | --- |
| laravel/framework src/Illuminate, single process | 1,526 | 220-236s | 1.5s | ~150x |
| laravel/framework, parallel (14 cores) | 1,526 | 31s | 2.0s | ~15x |
| laravel-queue-insights (public, 85 files) | 38 | 4.6s | 0.34s | ~13x |
| hihaho/rector-rules (public, 25 files) | 2 | 1.1s | 0.28s | ~4x |
| private 138-file corpus | 62 | 2.7s | 0.30s | ~9x |

Output byte-identical to a fresh run in every cache state, verified per measurement. The warm gain scales with how many pending diffs a project has; a fully clean project sees no change. Cold cost is the dependency capture (~7-8% interleaved in this guard-free form); the replay itself adds nothing measurable on top.

Verification

Invalidation is covered end-to-end in tests: own-content change, dependency change (fresh-process simulation), --no-diffs cross-replay, the hasChanged flag round-trip. Replay works in parallel mode (workers save, workers replay).

Comment thread src/NodeAnalyzer/NativeFunctionCallAnalyzer.php Outdated
Comment thread src/NodeTypeResolver/PHPStan/Scope/PHPStanNodeScopeResolver.php Outdated
SanderMuller and others added 3 commits June 11, 2026 11:18
The cache only checked each file's own content, so a clean file stayed
skipped on warm runs even when one of its dependencies changed, e.g. a
parent class method gaining a return type that lets a child file infer
its own. A fresh run reports the new change, a warm run misses it.

PHPStanNodeScopeResolver now records each file's dependencies during
scope resolution using PHPStan's own DependencyResolver, the same engine
behind PHPStan's result cache. Cache entries store the file's own hash
plus one hash per dependency, all re-validated on load; legacy string
entries self-upgrade on the next write. A failed capture skips caching
entirely rather than caching a partial set.

Function calls memoize their dependency files per resolved name, as
signature dependencies are identical at every call site.

Selective runs (--only, --only-suffix) bypass the cache write, same
guard as rectorphp#8029.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
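The per-name memoization described in this commit can be pictured with a small sketch. `FunctionDependencyMemo` and its resolver callback are hypothetical stand-ins; the actual lookup in the PR goes through PHPStan's DependencyResolver.

```php
<?php

declare(strict_types=1);

/** Hypothetical memo: dependency files are resolved once per function name. */
final class FunctionDependencyMemo
{
    /** @var array<string, string[]> */
    private array $filesByFunctionName = [];

    /**
     * @param callable(string): string[] $resolve expensive lookup, called at most once per name
     * @return string[]
     */
    public function filesFor(string $resolvedFunctionName, callable $resolve): array
    {
        // ??= evaluates $resolve() only when no result is stored yet;
        // an empty array counts as a stored result.
        return $this->filesByFunctionName[$resolvedFunctionName] ??= $resolve($resolvedFunctionName);
    }
}
```

Keying on the resolved name rather than the call site is what makes the memo safe: the signature's dependencies do not vary between call sites.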

Files with a pending diff are never marked clean in dry-run mode, the
diff must keep being reported, so every warm dry-run reprocessed them
from scratch. On a 4,400-file project with 37 pending diffs that was
~11s per run.

Cache the FileDiff with the file's own hash plus one hash per captured
dependency; when all still match, replay the cached diff instead of
reprocessing, skipping scope resolution entirely. Dry-run only: write
mode always computes fresh. --no-diffs results never cross into normal
entries, and the original hasChanged flag is replayed, as a rule can
report line changes while printing identical content.

Warm dry-run on the same project: ~9x faster single process, ~3.5x
parallel. Output stays byte-identical in every cache state.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

SimpleParameterProvider::hash() serializes the whole parameter bag and
contentHash() runs per file, so a warm run paid the serialization once
per file (~46ms per 3,200 calls with a 300-entry skip list).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
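A minimal sketch of the memoization this commit describes, assuming a simplified parameter holder. `MemoizedParameters` is hypothetical and not Rector's SimpleParameterProvider; dropping the memo on every write is an assumption made here for safety.

```php
<?php

declare(strict_types=1);

/** Hypothetical parameter holder that serializes the bag at most once per change. */
final class MemoizedParameters
{
    /** @var array<string, mixed> */
    private array $parameters = [];

    private ?string $hash = null;

    public function set(string $name, mixed $value): void
    {
        $this->parameters[$name] = $value;
        $this->hash = null; // assumption: any write drops the memoized hash
    }

    public function hash(): string
    {
        // serialize() over the whole bag is the expensive part, so it runs
        // only when no memoized value is available.
        return $this->hash ??= sha1(serialize($this->parameters));
    }
}
```

With this shape, a per-file key computation pays for serialization once per process instead of once per file, as long as the parameters stay unchanged.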
SanderMuller force-pushed the perf/dry-run-diff-cache branch from 7a94fea to bd25a20 on June 11, 2026 09:18
SanderMuller changed the title from "perf: replay cached dry-run diffs for unchanged files, 9-170x faster warm dry-runs" to "perf: replay cached dry-run diffs for unchanged files, 9-150x faster warm dry-runs" on Jun 11, 2026