fix(laminar): drain OTel processors before process.exit() to stop losing final agent spans (#50)
Merged

Alezander9 merged 1 commit into main, May 9, 2026.

Conversation
Eval traces were missing the final LLM span on ~27% of runs (gemini-3-flash 76%, glm-5.1 34%, mimo 14%, gpt-5.5 1%, claude 0%).

Root cause is a process-exit race: the plugin event hook in packages/opencode/src/plugin/index.ts:249 is invoked fire-and-forget (`void hook["event"]?.(...)`), so the bcode-laminar `session.idle` handler's `processor.forceFlush()` Promise is discarded, and the unconditional `process.exit()` in the top-level `finally` (index.ts:252) kills in-flight gRPC exports.

Model-dependence comes from emit shape: tool-only-then-final-text models (glm-5.1, gemini-3-flash) make one extra tool-less LLM round at the end whose lone `ai.streamText.doStream` span ends 50–200 ms before idle and is the freshest unflushed thing in the BatchSpanProcessor queue at exit. Tool-call + text-in-same-step models (claude-opus, gpt-5.5) fold the final answer into a step that ended seconds earlier and was already flushed in a prior batch.

Fix: in the top-level `finally` of index.ts, before `process.exit()`, fetch the global OTel TracerProvider via `@opentelemetry/api`, duck-check `forceFlush`, and race it against a 3 s timeout so a wedged exporter cannot hang bcode on exit. Generic to any OTel-based plugin. Does not touch the deeper bug (the fire-and-forget `event` hook); that is the proper upstream fix to anomalyco/opencode.
Summary
Laminar traces were missing the final agent LLM span on ~27% of overnight eval runs. Eddie's measurements:

- gemini-3-flash: 76%
- glm-5.1: 34%
- mimo: 14%
- gpt-5.5: 1%
- claude: 0%

Score impact is nil (the runner reads agent steps from `--format json` stdout, not from spans), but it makes the Laminar UI useless for any human dive into a glm/gemini trace — the closing text-emit LLM call is just not there.

Root cause
Three problems compose:

1. packages/opencode/src/plugin/index.ts:249 does `void hook["event"]?.({event})`. The bcode-laminar `session.idle` handler ends the turn span and `await processor.forceFlush()`, but its Promise is discarded by the bus subscriber.
2. `process.exit()` is unconditional. packages/opencode/src/index.ts:252 calls it in the top-level `finally`. Node's event loop is killed mid-flight; any in-flight gRPC export from the BatchSpanProcessor is severed.
3. `maxExportBatchSize=512`, `scheduledDelayMillis=5000`. The very last spans of a run sit in the queue waiting either for the 5 s timer or our forceFlush — and forceFlush loses the race to `process.exit()`.

Why it's model-dependent
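The discarded-Promise half of the race can be reproduced in a few lines (a standalone sketch, not project code; `forceFlush` here is a stand-in for the BatchSpanProcessor method):

```typescript
// Standalone sketch of the fire-and-forget race (not project code).
// `forceFlush` stands in for BatchSpanProcessor.forceFlush(), which
// completes on a later event-loop tick, like a real gRPC export.
let flushed = false;

async function forceFlush(): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 50));
  flushed = true;
}

// The bus subscriber discards the Promise, like `void hook["event"]?.(...)`:
void forceFlush();

// The top-level `finally` would call process.exit() right here. Synchronously,
// nothing has flushed yet, so the exit severs the pending export:
console.log(`flushed before exit: ${flushed}`); // "flushed before exit: false"
```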
The bug only ever bites the very last LLM span. Models split into two camps:

- Tool-call + text-in-same-step models (claude-opus, gpt-5.5) fold the final answer into a step whose `doStream` span ended many seconds before idle and was already flushed in an earlier batch. → 0–1% miss.
- Tool-only-then-final-text models (glm-5.1, gemini-3-flash) make one extra tool-less LLM round at the end with `finishReason=stop`. That lone final `doStream` span ends 50–200 ms before idle, so it's the freshest unflushed thing in the BSP queue when `process.exit()` fires. → 34–76% miss. Gemini-3-flash is the worst because its TTFB is fastest, so the gap between span-end and idle is the smallest.

It's a clean continuum on "fraction of runs where the closing text is in a separate, late, lone span."
What this PR does

In the top-level `finally` of packages/opencode/src/index.ts, before `process.exit()`:

- Fetch the global TracerProvider via `@opentelemetry/api` (already a dep of packages/opencode).
- Duck-check `forceFlush` (the API-level type doesn't declare it; `BasicTracerProvider` from `@opentelemetry/sdk-trace-base`, which `NodeSDK.start()` registers, implements it).
- `Promise.race` against a 3 s timeout so a wedged exporter cannot hang bcode on exit.

Generic OTel-API approach with no laminar import — benefits any future tracing plugin equally (DataDog, New Relic, OTLP collector, etc.).
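A minimal sketch of that drain step (the helper name `drainTracing` and the stub provider are illustrative, not the actual patch; in the real code the provider would come from `trace.getTracerProvider()` in `@opentelemetry/api`, and the stub keeps this sketch self-contained):

```typescript
// Illustrative sketch of the drain-before-exit step (not the actual patch).
// The provider is duck-typed because the API-level TracerProvider type does
// not declare forceFlush; the SDK's BasicTracerProvider implements it.
interface MaybeFlushable {
  forceFlush?: () => Promise<void>;
}

function drainTracing(provider: MaybeFlushable, timeoutMs = 3000): Promise<void> {
  // Duck-check: a noop provider (e.g. no LMNR key, plugin inactive) has no
  // forceFlush, so the exit path stays unchanged.
  if (typeof provider.forceFlush !== "function") return Promise.resolve();

  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<void>((resolve) => {
    timer = setTimeout(resolve, timeoutMs);
  });
  // A wedged exporter loses the race; we proceed to exit after timeoutMs.
  return Promise.race([provider.forceFlush(), timeout]).finally(() =>
    clearTimeout(timer),
  );
}

// Stub provider standing in for the registered BasicTracerProvider:
const stub: MaybeFlushable = {
  forceFlush: () => new Promise((resolve) => setTimeout(resolve, 10)),
};

drainTracing(stub)
  .then(() => drainTracing({})) // no forceFlush: duck-check short-circuits
  .then(() => console.log("drained")); // safe to call process.exit() here
```

The `clearTimeout` in the `finally` matters: without it, a fast flush would still leave a pending 3 s timer keeping the event loop alive.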
What this PR does NOT do

- It does not touch `scheduledDelayMillis`. The default 5 s exists to batch many spans into one gRPC call. Reducing it to 1 s would multiply gRPC traffic during long runs (~5×) without addressing the root race; once we drain at exit, the queue is fully flushed regardless of the timer. Leaving the BSP defaults alone.
- It does not fix the deeper bug (src/plugin/index.ts:249's `void hook["event"]?.(...)`). That's the proper upstream fix and should be a PR to anomalyco/opencode. This patch is a tactical drain at the only place we control unconditionally — the exit point.

Risk / blast radius
+12 lines in index.ts. One Yellow-zone touch (logged in EXCEPTIONS.md § "Drain OTel processors before process.exit()").

- When no OTel provider is registered: the duck-check fails and the exit path is unchanged.
- When LMNR_PROJECT_API_KEY is unset (laminar plugin no-ops): the global TracerProvider is the API noop provider with no forceFlush; the duck-check fails and the exit path is unchanged.
- When the exporter is wedged: bounded 3 s wait, then `process.exit()` fires anyway.

Expected effect on the missing-final rate: ~27% → near-zero across all models.
Verify
`bun typecheck` clean (filtered + per-package).