Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .changeset/backoff-maxelapsed-wallclock.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
"@trigger.dev/core": patch
---

Fix `ExponentialBackoff.execute` ignoring the `maxElapsed` boundary. The retry loop now stops once the real wall-clock time spent (callback duration plus sleeps) reaches `maxElapsed`, instead of only counting the summed sleep delays.
71 changes: 71 additions & 0 deletions packages/core/src/v3/apps/backoff.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
import { describe, expect, it } from "vitest";

import { ExponentialBackoff } from "./backoff.js";

function sleep(ms: number) {
return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

describe("ExponentialBackoff.execute", () => {
it("stops retrying once real wall-clock time exceeds maxElapsed", async () => {
const backoff = new ExponentialBackoff("NoJitter", {
factor: 0,
maxRetries: 1000,
maxElapsed: 0.05,
});

let attempts = 0;

const result = await backoff.execute(async () => {
attempts++;
await sleep(15);
throw new Error("always fails");
});

expect(result.success).toBe(false);
expect(attempts).toBeGreaterThan(1);
expect(attempts).toBeLessThan(1000);
});

it("returns the result when the callback succeeds", async () => {
const backoff = new ExponentialBackoff("NoJitter", {
factor: 0,
maxRetries: 5,
maxElapsed: 1,
});

let attempts = 0;

const result = await backoff.execute(async () => {
attempts++;
if (attempts < 3) {
throw new Error("not yet");
}
return "ok";
});

expect(result).toEqual({ success: true, result: "ok" });
expect(attempts).toBe(3);
});

it("stops at maxRetries when callbacks are fast and keep failing", async () => {
const backoff = new ExponentialBackoff("NoJitter", {
factor: 0,
maxRetries: 3,
maxElapsed: 60,
});

let attempts = 0;

const result = await backoff.execute(async () => {
attempts++;
throw new Error("always fails");
});

expect(result.success).toBe(false);
if (!result.success) {
expect(result.cause).toBe("MaxRetries");
}
expect(attempts).toBe(4);
});
});
4 changes: 4 additions & 0 deletions packages/core/src/v3/apps/backoff.ts
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 maxElapsed wall-clock check bypassed when AttemptTimeout fires due to continue

The new maxElapsed wall-clock check at line 344 is placed after the try/catch/finally block. When an AttemptTimeout is caught at line 336-337, the continue statement jumps to the next loop iteration, skipping the maxElapsed check entirely. While the finally block at line 339 correctly updates elapsedMs, the break condition at line 344 is never evaluated for timeout iterations.

This means that if a caller passes a non-zero attemptTimeoutMs, and callbacks keep timing out, the loop will ignore the maxElapsed boundary and continue retrying until maxRetries is exhausted or the iterator's own delay-based elapsed check (packages/core/src/v3/apps/backoff.ts:130) triggers — which only accounts for computed delay values, not real wall-clock time spent in timed-out callbacks.

Fix: move the check before continue or into finally

The maxElapsed check should be evaluated before the continue statement, e.g.:

if (error instanceof AttemptTimeout) {
  if (elapsedMs >= this.#maxElapsed * 1000) {
    break;
  }
  continue;
}

However, elapsedMs is only updated in the finally block which runs after the catch block, so the check would need to be restructured — either by moving the elapsed calculation before the check, or by placing the maxElapsed check inside the finally block.

(Refers to lines 336-338)

Prompt for agents
In packages/core/src/v3/apps/backoff.ts, the execute() method has a new maxElapsed wall-clock check at line 344 that is placed after the try/catch/finally block. However, the AttemptTimeout handler at line 336-338 uses 'continue' which skips past this check. The finally block at line 339-342 does correctly update elapsedMs before continue takes effect, but the break-check at line 344 is never reached for timeout iterations.

The fix requires restructuring: the maxElapsed check needs to happen AFTER elapsedMs is updated (in finally) but BEFORE continue skips it. Options:
1. Move the maxElapsed check into the finally block (but be careful: finally also runs on success/StopRetrying returns)
2. Update elapsedMs before the continue, then check maxElapsed, something like: compute elapsed in the catch block before continue, then check.
3. Use a flag set in finally and check it at the top of the next iteration.

The cleanest approach is probably to restructure the catch block for AttemptTimeout to not use continue — instead let it fall through to the maxElapsed check like normal errors do.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Original file line number Diff line number Diff line change
Expand Up @@ -340,6 +340,10 @@ export class ExponentialBackoff {
elapsedMs += Date.now() - start;
clearTimeout(attemptTimeout);
}

if (elapsedMs >= this.#maxElapsed * 1000) {
break;
}
Comment on lines +344 to +346
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 When maxElapsed is exceeded, cause is reported as "MaxRetries" which is semantically inaccurate

When the new wall-clock maxElapsed check triggers a break at line 345, the code falls through to line 354-359 which returns cause: "MaxRetries". This is semantically incorrect — the loop stopped because of elapsed time, not because retries were exhausted. The return type union at line 289 ("StopRetrying" | "Timeout" | "MaxRetries") doesn't include a "MaxElapsed" variant. The test at packages/core/src/v3/apps/backoff.test.ts:67 explicitly asserts cause: "MaxRetries" for the maxElapsed case, so this appears intentional to avoid a breaking change to the return type. However, callers who rely on cause to distinguish between these two exit conditions won't be able to tell them apart. This is a design trade-off rather than a bug, but worth considering adding a "MaxElapsed" cause in a future minor version.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

}

if (finalError instanceof AttemptTimeout) {
Expand Down