fix(stacks): mark promote approval executed AFTER preflight, not before (#11) #242

Merged: mastermanas805 merged 2 commits into master from fix/promote-mark-executed-after-preflight-2026-06-04 on Jun 4, 2026

@mastermanas805
Member

What

Closes sweep finding #11 (P2): a promote approval was burned to status='executed' BEFORE the promote preflight ran. Any preflight failure stranded the single-use approval as consumed and non-retryable.

Promote called MarkPromoteApprovalExecuted (inside consumeApprovedPromote) immediately after validating the approval_id — but BEFORE the preflight (source-services fetch, image_ref check, target create/update, vault copy, env load). A preflight failure (412 missing_image_ref / no_services, 503 lookup/create/env_load, 400 vault, 402 cap) therefore consumed the approval while the promote never launched, forcing the operator to request a fresh email approval.

Fix

Split consumeApprovedPromote into two halves:

  • validateApprovedPromote — read-only checks (uuid, lookup, ownership, status, from/to/kind match, expiry). Runs at the original position so an invalid approval still 4xxs early. Returns the row; no mutation.
  • markApprovedPromoteExecuted — the executed CAS flip + audit. Called ONLY after preflight fully succeeds, immediately before the runStackDeploy launch.
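
A minimal sketch of the new call order, assuming an in-memory approval and a pluggable preflight. The `Approval` type, the `promote` helper, and the error values are hypothetical stand-ins for illustration; the real handler in stack.go works against the database and performs many more checks.

```go
package main

import (
	"errors"
	"fmt"
)

// Approval is a hypothetical stand-in for the approval row.
type Approval struct {
	ID     string
	Status string // "approved" or "executed"
}

// errPreflight is an illustrative preflight failure (e.g. 412 missing_image_ref).
var errPreflight = errors.New("preflight failed: missing_image_ref")

// validateApprovedPromote models the read-only half: it checks the row
// and returns it without mutating anything.
func validateApprovedPromote(a *Approval) (*Approval, error) {
	if a == nil {
		return nil, errors.New("approval_not_found")
	}
	if a.Status != "approved" {
		return nil, errors.New("approval_already_executed")
	}
	return a, nil
}

// markApprovedPromoteExecuted models the mutating half: it flips the
// status only if the approval is still "approved".
func markApprovedPromoteExecuted(a *Approval) error {
	if a.Status != "approved" {
		return errors.New("approval_already_executed")
	}
	a.Status = "executed"
	return nil
}

// promote runs validate, then preflight, then mark-executed, then launch.
// A preflight error returns before the approval is touched.
func promote(a *Approval, preflight func() error, launch func()) error {
	row, err := validateApprovedPromote(a)
	if err != nil {
		return err
	}
	if err := preflight(); err != nil {
		return err // approval stays "approved" and can be retried
	}
	if err := markApprovedPromoteExecuted(row); err != nil {
		return err
	}
	launch()
	return nil
}

func main() {
	a := &Approval{ID: "demo", Status: "approved"}

	// First attempt: preflight fails, approval is left untouched.
	err := promote(a, func() error { return errPreflight }, func() {})
	fmt.Println(err, a.Status)

	// Retry with the same approval: preflight passes, approval is consumed.
	err = promote(a, func() error { return nil }, func() {})
	fmt.Println(err, a.Status)
}
```

The point of the ordering is that the only mutation sits after every fallible preflight step, so a failed attempt can be retried with the same approval.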

A preflight failure now leaves the approval approved and retryable. Single-use is preserved: MarkPromoteApprovalExecuted's CAS still returns 0 rows → 409 approval_already_executed on a concurrent double-consume.
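
The single-use guarantee can be illustrated with a compare-and-swap that reports rows affected. This is a sketch under stated assumptions: the `store` type, `markExecuted`, `consume`, and `raceConsume` are hypothetical, and a mutex-guarded map stands in for whatever conditional update the real MarkPromoteApprovalExecuted issues.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical in-memory stand-in for the approvals table.
type store struct {
	mu     sync.Mutex
	status map[string]string
}

// markExecuted mimics a conditional update: it flips the row only if it
// is still "approved" and returns the number of rows affected (0 or 1).
func (s *store) markExecuted(id string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.status[id] != "approved" {
		return 0
	}
	s.status[id] = "executed"
	return 1
}

// consume maps the CAS result to an HTTP-style status code:
// 0 rows affected becomes 409 (approval_already_executed).
func consume(s *store, id string) int {
	if s.markExecuted(id) == 0 {
		return 409
	}
	return 200
}

// raceConsume fires n concurrent consumes of the same approval and
// counts how many succeeded and how many hit the conflict arm.
func raceConsume(s *store, id string, n int) (ok, conflict int) {
	var wg sync.WaitGroup
	codes := make([]int, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			codes[i] = consume(s, id)
		}(i)
	}
	wg.Wait()
	for _, c := range codes {
		if c == 200 {
			ok++
		} else {
			conflict++
		}
	}
	return ok, conflict
}

func main() {
	s := &store{status: map[string]string{"a1": "approved"}}
	ok, conflict := raceConsume(s, "a1", 8)
	fmt.Println(ok, conflict) // prints "1 7": exactly one consumer wins
}
```

Because the status check and the flip happen atomically, moving the call later in the handler does not weaken single-use: whichever request reaches the CAS first wins, and every other request sees 0 rows.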

twin.go's consumeApprovedTwin is a separate consume path and out of scope for this fix.

Coverage block

Symptom:        preflight 412/503 strands a consumed (status='executed') approval
Enumeration:    MarkPromoteApprovalExecuted + consumeApprovedPromote call sites in stack.go
Sites found:    1  (stack.go Promote; twin.go has a separate consumeApprovedTwin)
Sites touched:  1
Coverage test:  TestStackPromote_ApprovalID_PreflightFails_StaysApproved (missing_image_ref),
                TestStackPromote_ApprovalID_NoServices_StaysApproved,
                TestStackPromote_ApprovalID_HappyPath_ExecutesOnce (executed once + single-use),
                TestMarkApprovedPromoteExecuted_AlreadyExecuted_409 (CAS-miss arm),
                TestStackFinal_ConsumeApproved_ExecuteError_503 (updated for new call order)
Live verified:  awaiting post-merge auto-deploy (rule 14 build-SHA gate in CI)

100% patch coverage (both split functions + the new call sites). go vet clean; gofmt clean.

🤖 Generated with Claude Code

…re (#11)

The Promote handler called MarkPromoteApprovalExecuted inside
consumeApprovedPromote BEFORE the promote preflight (source-services fetch,
image_ref check, target create/update, vault copy, env load) ran. Any
preflight failure (412 missing_image_ref / no_services, 503 lookup/create/
env_load, 400 vault, 402 cap) therefore burned the single-use approval to
'executed' while the promote never launched — stranding the operator with a
consumed, non-retryable approval and forcing a fresh email round-trip.

Split consumeApprovedPromote into:
  - validateApprovedPromote — read-only checks (uuid, lookup, ownership,
    status, from/to/kind match, expiry), runs at the original position so an
    invalid approval still 4xxs early. Returns the row, no mutation.
  - markApprovedPromoteExecuted — the 'executed' CAS flip + audit, called
    ONLY after preflight fully succeeds, immediately before the runStackDeploy
    launch.

A preflight failure now leaves the approval 'approved' and retryable. The
single-use guarantee is preserved: MarkPromoteApprovalExecuted's CAS still
returns 0 rows (409 approval_already_executed) on a concurrent double-consume.

Twin promote (consumeApprovedTwin in twin.go) has its own consume path and is
out of scope for this fix.

Coverage:
  Symptom:        preflight 412/503 strands a consumed (status='executed') approval
  Enumeration:    rg MarkPromoteApprovalExecuted + consumeApprovedPromote call sites in stack.go
  Sites found:    1 (stack.go Promote; twin.go has a separate consumeApprovedTwin)
  Sites touched:  1
  Coverage test:  TestStackPromote_ApprovalID_PreflightFails_StaysApproved (missing_image_ref),
                  TestStackPromote_ApprovalID_NoServices_StaysApproved,
                  TestStackPromote_ApprovalID_HappyPath_ExecutesOnce (executed once + single-use),
                  TestMarkApprovedPromoteExecuted_AlreadyExecuted_409 (CAS-miss arm),
                  TestStackFinal_ConsumeApproved_ExecuteError_503 (updated for new call order)
  Live verified:  awaiting post-merge auto-deploy (rule 14 build-SHA gate in CI)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 enabled auto-merge (squash) June 4, 2026 15:40
@mastermanas805 mastermanas805 merged commit 9aba000 into master Jun 4, 2026
18 checks passed