chore(webapp,redis-worker): make mollifier constants configurable#3822

Merged
d-cs merged 5 commits into
mainfrom
mollifier-configurable-constants
Jun 4, 2026

Conversation

Collaborator

@d-cs d-cs commented Jun 3, 2026

Summary

The mollifier had ~21 behavioural constants baked in as hardcoded values: the buffer's ack-grace TTL and Redis retry/reconnect tuning, the drainer's poll interval and backoff envelope, the pre-gate idempotency claim TTL/wait/poll, the buffered-run mutate-with-fallback wait loop, the metadata CAS retry budget and backoff, the stale-sweep scan bounds, and the draining-gauge interval. None could be adjusted without a code change, which made it impossible to tune the system under production load.

This exposes all of them as TRIGGER_MOLLIFIER_* environment variables, each defaulting to its previous hardcoded value. Behaviour is identical unless an operator sets a var, so it's a safe no-op deploy.
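
As a rough illustration of the "default to the previous hardcoded value" contract, the sketch below parses one variable with a fallback. The helper `intFromEnv` and the `Env` type are hypothetical and are not taken from the PR (the real `env.server.ts` may well use a schema library instead). The variable name and its 2000 ms default are the ones quoted in the review comment later in this thread.

```ts
// Hypothetical helper, not the PR's code: read an integer env var and fall
// back to the previous hardcoded value when the variable is unset or empty.
type Env = Record<string, string | undefined>;

function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") {
    return fallback; // unset: behaviour stays identical to the old constant
  }
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    // Fail loudly at boot rather than run with a nonsensical tunable.
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

// With nothing set, the deploy is a no-op.
const unchanged = intFromEnv({}, "TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS", 2000);

// An operator override takes effect without a code change.
const tuned = intFromEnv(
  { TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS: "5000" },
  "TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS",
  2000
);
```

Passing the environment in as a plain object, rather than reading a global, keeps the helper easy to test.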

Design

The package-level classes (MollifierBuffer, MollifierDrainer in @trigger.dev/redis-worker) gain optional constructor options defaulting to the old constants — backward compatible, hence a patch changeset. The webapp factories and worker bootstraps read the env and pass them through. The route- and concern-level pure helpers (mutate-with-fallback, metadata mutation, idempotency claim, stale-sweep state) keep their existing ?? DEFAULT option fallbacks and are fed env values at their call sites, so they stay unit-testable without importing env.server.
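
A minimal sketch of the `?? DEFAULT` pattern described above, assuming hypothetical option names (`safetyNetMs`, `pollStepMs`) and a hypothetical helper `resolveWaitOptions`. Only the 2000 ms safety-net default is mentioned in this thread; the other value is purely illustrative.

```ts
// Hypothetical defaults. Only the 2000 ms figure appears in this thread.
const DEFAULT_SAFETY_NET_MS = 2000;
const DEFAULT_POLL_STEP_MS = 20; // illustrative value only

interface WaitOptions {
  safetyNetMs?: number;
  pollStepMs?: number;
}

// Pure helper: it imports nothing from the env module, so a unit test can
// call it with plain objects.
function resolveWaitOptions(options: WaitOptions = {}): Required<WaitOptions> {
  return {
    // `??` falls back only on null/undefined, so an explicit 0 is preserved.
    safetyNetMs: options.safetyNetMs ?? DEFAULT_SAFETY_NET_MS,
    pollStepMs: options.pollStepMs ?? DEFAULT_POLL_STEP_MS,
  };
}

// Call site (a route or factory): the only layer that knows about env values.
const valuesFromEnv: WaitOptions = { safetyNetMs: 5000 };
const resolved = resolveWaitOptions(valuesFromEnv);
```

Keeping the fallback inside the helper means callers that pass nothing still get the old behaviour, which is what makes the change backward compatible.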

Test plan

  • @trigger.dev/redis-worker builds
  • webapp typecheck passes
  • mollifier buffer + drainer testcontainer suites pass (modulo a couple of pre-existing flaky timing tests)
  • Reviewer: confirm the TRIGGER_MOLLIFIER_* env var names match ops conventions


changeset-bot Bot commented Jun 3, 2026

🦋 Changeset detected

Latest commit: 7458de2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 25 packages
Name Type
@trigger.dev/redis-worker Patch
@internal/run-engine Patch
@internal/schedule-engine Patch
@trigger.dev/build Patch
@trigger.dev/core Patch
@trigger.dev/plugins Patch
@trigger.dev/python Patch
@trigger.dev/react-hooks Patch
@trigger.dev/rsc Patch
@trigger.dev/schema-to-json Patch
@trigger.dev/sdk Patch
@trigger.dev/database Patch
@trigger.dev/otlp-importer Patch
@trigger.dev/rbac Patch
trigger.dev Patch
@internal/cache Patch
@internal/clickhouse Patch
@internal/llm-model-catalog Patch
@internal/redis Patch
@internal/replication Patch
@internal/testcontainers Patch
@internal/tracing Patch
@internal/tsql Patch
@internal/zod-worker Patch
@internal/sdk-compat-tests Patch


Contributor

coderabbitai Bot commented Jun 3, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f7774abe-2805-43f9-bc9e-6d94920950f7

📥 Commits

Reviewing files that changed from the base of the PR and between 1947020 and 1b95033.

📒 Files selected for processing (1)
  • apps/webapp/app/env.server.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/webapp/app/env.server.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: e2e-webapp / 🧪 E2E Tests: Webapp
  • GitHub Check: 🛡️ E2E Auth Tests (full)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (actions)

Walkthrough

This PR makes mollifier buffer and drainer internals configurable via environment variables and wiring. Twenty new TRIGGER_MOLLIFIER_* environment variables control stale-sweep bounds, ACK TTL, Redis retry/reconnect backoff, drainer poll/backoff/shutdown/gauge timing, idempotency claim timing, mutate-with-fallback polling/backoff, and metadata CAS retry backoff. The redis-worker library types and implementation are extended to accept these options; webapp initialization and worker code pass env values into MollifierBuffer, MollifierDrainer, and related components; and API routes thread the tunables into apply/mutate/claim flows.
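
To make the polling/backoff tunables concrete, here is one plausible way a poll step, a step cap, a backoff factor, and an overall safety net could combine into a wait schedule. This is an assumption for illustration, not the PR's algorithm: the function `pollDelays`, the field names, and the exact semantics are all hypothetical.

```ts
// Hypothetical tuning shape; field names are illustrative, not the PR's.
interface PollTuning {
  safetyNetMs: number;   // total time budget for the wait loop
  pollStepMs: number;    // first delay between polls
  maxPollStepMs: number; // cap on any single delay
  backoffFactor: number; // multiplier applied after each poll
}

// Returns the sequence of delays a capped exponential backoff would sleep
// for before the safety net expires.
function pollDelays(t: PollTuning): number[] {
  if (t.pollStepMs <= 0 || t.maxPollStepMs <= 0 || t.backoffFactor < 1) {
    throw new Error("poll steps must be positive and backoffFactor >= 1");
  }
  const delays: number[] = [];
  let elapsed = 0;
  let step = t.pollStepMs;
  while (elapsed < t.safetyNetMs) {
    // Never exceed the per-step cap or overshoot the remaining budget.
    const delay = Math.min(step, t.maxPollStepMs, t.safetyNetMs - elapsed);
    delays.push(delay);
    elapsed += delay;
    step *= t.backoffFactor;
  }
  return delays;
}

const schedule = pollDelays({
  safetyNetMs: 100,
  pollStepMs: 10,
  maxPollStepMs: 40,
  backoffFactor: 2,
});
// schedule is [10, 20, 40, 30]: doubling until the cap, then the remainder.
```

Under these assumed semantics, raising the safety net lengthens the total wait, while the step cap bounds how stale any single poll can be.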

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: making mollifier constants configurable via environment variables. |
| Description check | ✅ Passed | The description provides comprehensive context (Summary, Design, Test plan) but lacks some template sections (Closes #issue, Changelog, Screenshots) and the testing section is minimal. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@d-cs d-cs self-assigned this Jun 3, 2026
@d-cs d-cs force-pushed the mollifier-configurable-constants branch from 789c916 to 84f98c3 Compare June 3, 2026 15:21

pkg-pr-new Bot commented Jun 3, 2026


@trigger.dev/build

npm i https://pkg.pr.new/@trigger.dev/build@bb4b0af

trigger.dev

npm i https://pkg.pr.new/trigger.dev@bb4b0af

@trigger.dev/core

npm i https://pkg.pr.new/@trigger.dev/core@bb4b0af

@trigger.dev/plugins

npm i https://pkg.pr.new/@trigger.dev/plugins@bb4b0af

@trigger.dev/python

npm i https://pkg.pr.new/@trigger.dev/python@bb4b0af

@trigger.dev/react-hooks

npm i https://pkg.pr.new/@trigger.dev/react-hooks@bb4b0af

@trigger.dev/redis-worker

npm i https://pkg.pr.new/@trigger.dev/redis-worker@bb4b0af

@trigger.dev/rsc

npm i https://pkg.pr.new/@trigger.dev/rsc@bb4b0af

@trigger.dev/schema-to-json

npm i https://pkg.pr.new/@trigger.dev/schema-to-json@bb4b0af

@trigger.dev/sdk

npm i https://pkg.pr.new/@trigger.dev/sdk@bb4b0af

commit: bb4b0af

coderabbitai[bot]

This comment was marked as resolved.

d-cs and others added 2 commits June 4, 2026 09:33
Expose every previously-hardcoded mollifier tunable (buffer ack TTL and
Redis retry/reconnect, drainer poll interval and backoff, idempotency
claim TTL/wait/poll, mutate-fallback wait loop, metadata CAS retries,
stale-sweep scan bounds, draining-gauge interval) via TRIGGER_MOLLIFIER_*
env vars, each defaulting to its prior hardcoded value so behaviour is
unchanged unless overridden.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two box-diagram docs in the mollifier dir for onboarding and tuning:
TRIP.md covers ingress (gate → rate counter → buffer) and DRAINER.md
covers egress (Redis → drainer fan-out → Postgres), each annotating the
env-var lever on every edge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@d-cs d-cs force-pushed the mollifier-configurable-constants branch from 84f98c3 to 1947020 Compare June 4, 2026 08:38
coderabbitai[bot]

This comment was marked as resolved.

d-cs and others added 3 commits June 4, 2026 10:07
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@d-cs d-cs marked this pull request as ready for review June 4, 2026 09:22
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 potential issue.

⚠️ 1 issue in files not directly in the diff

⚠️ Incomplete wiring: reschedule and tags routes don't pass configurable mutateWithFallback parameters (apps/webapp/app/routes/api.v1.runs.$runParam.reschedule.ts:101-149)

The PR makes mutateWithFallback poll/backoff knobs configurable via env vars (TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS, TRIGGER_MOLLIFIER_MUTATE_POLL_STEP_MS, TRIGGER_MOLLIFIER_MUTATE_MAX_POLL_STEP_MS, TRIGGER_MOLLIFIER_MUTATE_BACKOFF_FACTOR). The cancel route (api.v2.runs.$runParam.cancel.ts:66-69) was updated to pass these from appEnv, but two other production callers of mutateWithFallback were not: api.v1.runs.$runParam.reschedule.ts:101 and api.v1.runs.$runId.tags.ts:64. Those routes still use the hardcoded defaults in mutateWithFallback.server.ts. If an operator tunes, say, TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS to 5000ms, the cancel route respects the override while reschedule and tags routes continue using 2000ms — an inconsistent operator experience that contradicts the PR's stated intent of making these internals configurable.
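
One way to avoid this kind of drift is to build the options object in a single place and have every caller of `mutateWithFallback` use it. The sketch below assumes the four env values have already been parsed to numbers. The function `mutateTuningFromEnv` and the option field names are hypothetical, and the real signature of `mutateWithFallback` is not shown in this thread.

```ts
// The four variable names come from the finding above; the assumption here is
// that the env module exposes them as already-parsed numbers.
interface MutateTuningEnv {
  TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS: number;
  TRIGGER_MOLLIFIER_MUTATE_POLL_STEP_MS: number;
  TRIGGER_MOLLIFIER_MUTATE_MAX_POLL_STEP_MS: number;
  TRIGGER_MOLLIFIER_MUTATE_BACKOFF_FACTOR: number;
}

// Hypothetical option names; the helper's real option shape may differ.
interface MutateTuningOptions {
  safetyNetMs: number;
  pollStepMs: number;
  maxPollStepMs: number;
  backoffFactor: number;
}

// Single mapping shared by the cancel, reschedule, and tags routes, so an
// override cannot apply to one route and be missed by another.
function mutateTuningFromEnv(env: MutateTuningEnv): MutateTuningOptions {
  return {
    safetyNetMs: env.TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS,
    pollStepMs: env.TRIGGER_MOLLIFIER_MUTATE_POLL_STEP_MS,
    maxPollStepMs: env.TRIGGER_MOLLIFIER_MUTATE_MAX_POLL_STEP_MS,
    backoffFactor: env.TRIGGER_MOLLIFIER_MUTATE_BACKOFF_FACTOR,
  };
}

// Example values only; 5000 mirrors the override used in the finding.
const tuning = mutateTuningFromEnv({
  TRIGGER_MOLLIFIER_MUTATE_SAFETY_NET_MS: 5000,
  TRIGGER_MOLLIFIER_MUTATE_POLL_STEP_MS: 20,
  TRIGGER_MOLLIFIER_MUTATE_MAX_POLL_STEP_MS: 250,
  TRIGGER_MOLLIFIER_MUTATE_BACKOFF_FACTOR: 2,
});
```

Each route would then spread `tuning` into its call, so adding a fifth knob later touches one function instead of three call sites.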

View 4 additional findings in Devin Review.


@d-cs d-cs enabled auto-merge (squash) June 4, 2026 09:31
@d-cs d-cs merged commit 4ea3ef1 into main Jun 4, 2026
39 checks passed
@d-cs d-cs deleted the mollifier-configurable-constants branch June 4, 2026 09:36
ericallam pushed a commit that referenced this pull request Jun 5, 2026
## Summary
1 new feature, 8 improvements, 1 bug fix.

## Highlights

- Add optional `shouldPauseScaling` to the supervisor consumer pool
scaling options to freeze scale-up while it returns true (scale-down
stays allowed).
([#3836](#3836))

## Improvements
- The MCP server no longer tells the AI agent to wait for a run to
complete after every `trigger_task` call. Waiting is now opt-in: the
agent only waits when you ask it to (for example "trigger and then wait
for it to finish"). This avoids burning tokens polling runs you didn't
need to block on and keeps responses clearer.
([#3838](#3838))
- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))
- `envvars.upload` now accepts an optional `isSecret` flag, letting you
create the imported variables as secret (redacted) environment
variables. When omitted, variables default to non-secret.
([#3809](#3809))
- Offload large trigger payloads to object storage before sending the
trigger API request. The SDK uploads packets at or above the existing
128KB limit and sends an `application/store` pointer instead of
embedding large JSON in the request body. `TriggerTaskRequestBody` now
validates that `application/store` payloads are non-empty storage paths.
([#3785](#3785))
- Make mollifier buffer and drainer internals configurable.
`MollifierBuffer` now accepts `ackGraceTtlSeconds`,
`maxRetriesPerRequest`, `reconnectStepMs`, and `reconnectMaxMs` options,
and `MollifierDrainer` accepts `maxBackoffMs` and `backoffFloorMs`. All
default to their previous hardcoded values, so existing behaviour is
unchanged.
([#3822](#3822))
- `MollifierDrainer` accepts a `drainBatchSize` option (default 1) that
controls how many entries are popped per env per tick — in-flight
handlers remain capped by the global `concurrency`. `MollifierBuffer`
also gains `getDrainingCount()` / `listStaleDraining()`, backed by a new
`mollifier:draining` ZSET maintained atomically with
pop/ack/fail/requeue (observability-only).
([#3797](#3797))
- Adds AI SDK 7 support. The `ai` peer range now includes v7, and the
`chat.agent` / chat surfaces work against v7's ESM-only build. On v7,
install `@ai-sdk/otel` alongside `ai` and the SDK registers it for you
so `experimental_telemetry` spans keep flowing into your run traces (v7
stopped emitting them from `ai` core). v5 and v6 keep working unchanged.
([#3833](#3833))
- `useTriggerChatTransport` now recovers when restored session state
points at a session that no longer exists in the current environment
([#3816](#3816))

## Bug fixes
- Fix `@trigger.dev/core` build: cast the underlying log record exporter
when calling `forceFlush` so it typechecks against the updated
OpenTelemetry `LogRecordExporter` type (which no longer declares
`forceFlush`).
([#3829](#3829))

<details>
<summary>Raw changeset output</summary>

⚠️⚠️⚠️⚠️⚠️⚠️

`main` is currently in **pre mode** so this branch has prereleases
rather than normal releases. If you want to exit prereleases, run
`changeset pre exit` on `main`.

⚠️⚠️⚠️⚠️⚠️⚠️

# Releases
## @trigger.dev/build@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## trigger.dev@4.5.0-rc.5

### Patch Changes

- The MCP server no longer tells the AI agent to wait for a run to
complete after every `trigger_task` call. Waiting is now opt-in: the
agent only waits when you ask it to (for example "trigger and then wait
for it to finish"). This avoids burning tokens polling runs you didn't
need to block on and keeps responses clearer.
([#3838](#3838))
- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`
    -   `@trigger.dev/build@4.5.0-rc.5`
    -   `@trigger.dev/schema-to-json@4.5.0-rc.5`

## @trigger.dev/core@4.5.0-rc.5

### Patch Changes

- Add optional `shouldPauseScaling` to the supervisor consumer pool
scaling options to freeze scale-up while it returns true (scale-down
stays allowed).
([#3836](#3836))

- Fix `@trigger.dev/core` build: cast the underlying log record exporter
when calling `forceFlush` so it typechecks against the updated
OpenTelemetry `LogRecordExporter` type (which no longer declares
`forceFlush`).
([#3829](#3829))

- `envvars.upload` now accepts an optional `isSecret` flag, letting you
create the imported variables as secret (redacted) environment
variables. When omitted, variables default to non-secret.
([#3809](#3809))

    ```ts
    await envvars.upload("proj_1234", "prod", {
      variables: { STRIPE_SECRET_KEY: "sk_live_..." },
      isSecret: true,
    });
    ```

- Offload large trigger payloads to object storage before sending the
trigger API request. The SDK uploads packets at or above the existing
128KB limit and sends an `application/store` pointer instead of
embedding large JSON in the request body. `TriggerTaskRequestBody` now
validates that `application/store` payloads are non-empty storage paths.
([#3785](#3785))

Payload uploads use the same resolved `ApiClient` as the trigger call
(including `requestOptions.clientConfig`), not only the global
`apiClientManager.client` — so custom `baseURL`, access token, and
preview branch apply to both presign and trigger.

- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))

## @trigger.dev/plugins@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/python@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/sdk@4.5.0-rc.5`
    -   `@trigger.dev/core@4.5.0-rc.5`
    -   `@trigger.dev/build@4.5.0-rc.5`

## @trigger.dev/react-hooks@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/redis-worker@4.5.0-rc.5

### Patch Changes

- Make mollifier buffer and drainer internals configurable.
`MollifierBuffer` now accepts `ackGraceTtlSeconds`,
`maxRetriesPerRequest`, `reconnectStepMs`, and `reconnectMaxMs` options,
and `MollifierDrainer` accepts `maxBackoffMs` and `backoffFloorMs`. All
default to their previous hardcoded values, so existing behaviour is
unchanged.
([#3822](#3822))
- `MollifierDrainer` accepts a `drainBatchSize` option (default 1) that
controls how many entries are popped per env per tick — in-flight
handlers remain capped by the global `concurrency`. `MollifierBuffer`
also gains `getDrainingCount()` / `listStaleDraining()`, backed by a new
`mollifier:draining` ZSET maintained atomically with
pop/ack/fail/requeue (observability-only).
([#3797](#3797))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/rsc@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/schema-to-json@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/sdk@4.5.0-rc.5

### Patch Changes

- Adds AI SDK 7 support. The `ai` peer range now includes v7, and the
`chat.agent` / chat surfaces work against v7's ESM-only build. On v7,
install `@ai-sdk/otel` alongside `ai` and the SDK registers it for you
so `experimental_telemetry` spans keep flowing into your run traces (v7
stopped emitting them from `ai` core). v5 and v6 keep working unchanged.
([#3833](#3833))

- `useTriggerChatTransport` now recovers when restored session state
points at a session that no longer exists in the current environment
([#3816](#3816))

- Offload large trigger payloads to object storage before sending the
trigger API request. The SDK uploads packets at or above the existing
128KB limit and sends an `application/store` pointer instead of
embedding large JSON in the request body. `TriggerTaskRequestBody` now
validates that `application/store` payloads are non-empty storage paths.
([#3785](#3785))

Payload uploads use the same resolved `ApiClient` as the trigger call
(including `requestOptions.clientConfig`), not only the global
`apiClientManager.client` — so custom `baseURL`, access token, and
preview branch apply to both presign and trigger.

- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

2 participants