fix(core): enforce maxElapsed against wall-clock time in ExponentialB…#3779
fix(core): enforce maxElapsed against wall-clock time in ExponentialB…#3779singhharsh1708 wants to merge 1 commit into
Conversation
…ackoff.execute fixes triggerdotdev#3726
🦋 Changeset detectedLatest commit: 09b5c46 The changes in this PR will be included in the next version bump. This PR includes changesets to release 32 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Hi @singhharsh1708, thanks for your interest in contributing! This project requires that pull request authors are vouched, and you are not in the list of vouched users. This PR will be closed automatically. See https://github.com/triggerdotdev/trigger.dev/blob/main/CONTRIBUTING.md for more details. |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
WalkthroughThis PR fixes Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🟡 maxElapsed wall-clock check bypassed when AttemptTimeout fires due to continue
The new maxElapsed wall-clock check at line 344 is placed after the try/catch/finally block. When an AttemptTimeout is caught at line 336-337, the continue statement jumps to the next loop iteration, skipping the maxElapsed check entirely. While the finally block at line 339 correctly updates elapsedMs, the break condition at line 344 is never evaluated for timeout iterations.
This means that if a caller passes a non-zero attemptTimeoutMs, and callbacks keep timing out, the loop will ignore the maxElapsed boundary and continue retrying until maxRetries is exhausted or the iterator's own delay-based elapsed check (packages/core/src/v3/apps/backoff.ts:130) triggers — which only accounts for computed delay values, not real wall-clock time spent in timed-out callbacks.
Fix: move the check before continue or into finally
The maxElapsed check should be evaluated before the continue statement, e.g.:
if (error instanceof AttemptTimeout) {
if (elapsedMs >= this.#maxElapsed * 1000) {
break;
}
continue;
}However, elapsedMs is only updated in the finally block which runs after the catch block, so the check would need to be restructured — either by moving the elapsed calculation before the check, or by placing the maxElapsed check inside the finally block.
(Refers to lines 336-338)
Prompt for agents
In packages/core/src/v3/apps/backoff.ts, the execute() method has a new maxElapsed wall-clock check at line 344 that is placed after the try/catch/finally block. However, the AttemptTimeout handler at line 336-338 uses 'continue' which skips past this check. The finally block at line 339-342 does correctly update elapsedMs before continue takes effect, but the break-check at line 344 is never reached for timeout iterations.
The fix requires restructuring: the maxElapsed check needs to happen AFTER elapsedMs is updated (in finally) but BEFORE continue skips it. Options:
1. Move the maxElapsed check into the finally block (but be careful: finally also runs on success/StopRetrying returns)
2. Update elapsedMs before the continue, then check maxElapsed, something like: compute elapsed in the catch block before continue, then check.
3. Use a flag set in finally and check it at the top of the next iteration.
The cleanest approach is probably to restructure the catch block for AttemptTimeout to not use continue — instead let it fall through to the maxElapsed check like normal errors do.
Was this helpful? React with 👍 or 👎 to provide feedback.
| if (elapsedMs >= this.#maxElapsed * 1000) { | ||
| break; | ||
| } |
There was a problem hiding this comment.
🚩 When maxElapsed is exceeded, cause is reported as "MaxRetries" which is semantically inaccurate
When the new wall-clock maxElapsed check triggers a break at line 345, the code falls through to line 354-359 which returns cause: "MaxRetries". This is semantically incorrect — the loop stopped because of elapsed time, not because retries were exhausted. The return type union at line 289 ("StopRetrying" | "Timeout" | "MaxRetries") doesn't include a "MaxElapsed" variant. The test at packages/core/src/v3/apps/backoff.test.ts:67 explicitly asserts cause: "MaxRetries" for the maxElapsed case, so this appears intentional to avoid a breaking change to the return type. However, callers who rely on cause to distinguish between these two exit conditions won't be able to tell them apart. This is a design trade-off rather than a bug, but worth considering adding a "MaxElapsed" cause in a future minor version.
Was this helpful? React with 👍 or 👎 to provide feedback.
Closes #3726
✅ Checklist
Testing
[Describe the steps you took to test this change]
Changelog
[Short description of what has changed]
Screenshots
[Screenshots]
💯