feat(ce-dispatch): single-unit sync MVP rewrite by devin-ai-integration[bot] · Pull Request #4 · Fedgroup-Innovation/compound-engineering-plugin

devin-ai-integration · 2026-05-04T21:28:52Z

Summary

Rewrites ce-dispatch from multi-unit fan-out to a single-unit sync MVP. Each invocation creates one GitHub issue for one implementation unit; orchestrator and the in-workspace agent coordinate via issue comments and the resulting PR; the user pings each side manually.

The skill stays at plugins/compound-engineering/skills/ce-dispatch/ (no rename) and keeps the beta-skills triplet (name: ce-dispatch, [BETA] description prefix, disable-model-invocation: true) — the directory was never -beta; the markers ride in the frontmatter, per the beta-skills framework.

Behavior changes

Phase 0 — drop dispatch_mode and dispatch_auto_review; keep dispatch_branch_prefix, dispatch_base_branch, dispatch_labels. New Phase 0.3 asks the user for the worktree absolute path (the user creates the Conductor workspace before invoking the skill, which closes the chicken-and-egg of needing the worktree before the issue exists). Worktree dirname becomes the agent name used in comments.
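As a rough illustration of the last point in Phase 0, the agent name can be derived from the worktree path by taking its final directory segment. This is a minimal sketch, not the skill's actual logic: the function name `agentNameFromWorktree` and the example path are hypothetical, and POSIX-style absolute paths are assumed.

```typescript
// Hypothetical helper (not from the PR): derive the agent name from the
// absolute worktree path supplied in Phase 0.3. Assumes POSIX-style paths.
function agentNameFromWorktree(worktreePath: string): string {
  if (!worktreePath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${worktreePath}`);
  }
  // Drop trailing slashes, then keep only the last path segment (the dirname).
  const trimmed = worktreePath.replace(/\/+$/, "");
  const name = trimmed.slice(trimmed.lastIndexOf("/") + 1);
  if (name === "") {
    throw new Error("worktree path has no directory name");
  }
  return name;
}

// Example with a made-up workspace path.
console.log(agentNameFromWorktree("/tmp/workspaces/lisbon/"));
```

Rejecting relative paths up front mirrors the requirement that the user supplies an absolute path; whether the real skill validates this is not stated in the PR.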

Phase 1 — pick exactly one implementation unit. Drop dependency-graph construction, parallel-safety overlap checks, unit coalescing.

Phase 2 — render a single self-contained dispatch prompt. Populate three new template sections (<orientation>, <agent-identity>, <comment-protocol>) alongside existing ones. Drop multi-unit coalescing branches.

Phase 3 — create exactly one issue via gh.

Phase 4 — collapse the six-option monitor loop to a four-option respond menu (Reply to agent comment / Review the PR / Mark unit complete / Done for now). Drop dependency-aware merge gating, dependency-graph rendering, auto re-dispatch, auto-review. Compose existing CE skills (ce-code-review, ce-resolve-pr-feedback) for the PR-feedback round-trip.
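The respond menu can be pictured as a small routing function. The sketch below is only an illustration under assumptions: the four labels come from this description, the `OPEN`/`CLOSED`/`MERGED` values are the PR states reported by `gh pr view --json state`, and the returned action strings and the `nextAction` name are hypothetical. The merged-before-close gate reflects the "Mark unit complete" behavior described in the test plan.

```typescript
// The four respond-menu labels as listed in this PR description.
type RespondOption =
  | "Reply to agent comment"
  | "Review the PR"
  | "Mark unit complete"
  | "Done for now";

// PR states as reported by `gh pr view --json state`.
type PrState = "OPEN" | "CLOSED" | "MERGED";

// Hypothetical routing: the action strings are illustrative, not skill output.
function nextAction(option: RespondOption, prState: PrState | null): string {
  switch (option) {
    case "Reply to agent comment":
      return "post-orchestrator-comment";
    case "Review the PR":
      // Review is composed from the existing ce-code-review skill.
      return prState === null ? "no-pr-yet" : "run-ce-code-review";
    case "Mark unit complete":
      // Only close the issue once the PR is actually merged.
      return prState === "MERGED" ? "close-issue" : "refuse-pr-not-merged";
    case "Done for now":
      return "exit";
  }
}

console.log(nextAction("Mark unit complete", "OPEN"));
```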

Prompt template changes

Add <orientation> for progressive context exposure (path list, not inlined content) — README, AGENTS.md, plan, architecture doc, unit's pattern files, unit's Files: paths.
Add <agent-identity> carrying agent-name and worktree-path.
Add <comment-protocol> with the [<agent-name> -> orchestrator] timestamped format and explicit STOP-after-asking directive — closes the airgap concern (in-workspace agent must not make architectural decisions without orchestrator input).
Rework <ce-plugin> as an explicit nine-step compound-engineering loop: read orientation -> /ce-work -> implement and verify -> /ce-code-review -> /ce-compound (optional) -> /ce-commit-push-pr -> comment with PR URL -> stop and wait -> on ping run /ce-resolve-pr-feedback. Loop.
Drop dependencies: from metadata footer; rename unit_ids: -> unit_id:; add agent_name: and worktree_path:.
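To make the comment-protocol prefix concrete, here is a small sketch of formatting and parsing it. The `[<from> -> <to>]` bracket shape comes from this PR; placing an ISO 8601 timestamp directly after the bracket is an assumption about the exact layout, and the function names and the agent name `lisbon` are hypothetical.

```typescript
// Hypothetical formatter for the comment-protocol prefix.
// Assumed layout: "[<from> -> <to>] <ISO 8601 timestamp>".
function commentPrefix(from: string, to: string, at: Date): string {
  return `[${from} -> ${to}] ${at.toISOString()}`;
}

// Matches the assumed layout at the start of a comment body.
const PREFIX = /^\[([^\]\s]+) -> ([^\]\s]+)\] (\S+)/;

// Returns sender, recipient, and timestamp, or null when the prefix is absent.
function parseCommentPrefix(
  body: string,
): { from: string; to: string; at: string } | null {
  const m = PREFIX.exec(body);
  return m === null ? null : { from: m[1], to: m[2], at: m[3] };
}

const prefix = commentPrefix(
  "lisbon",
  "orchestrator",
  new Date("2026-05-04T21:28:52Z"),
);
console.log(`${prefix} Should the retry limit be configurable?`);
```

A parseable prefix like this would let either side pick out the latest message addressed to it, which is the kind of lookup "Reply to agent comment" needs.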

Configs / ce-plan routing / conductor-notes

Mirror dropped/retained keys across .compound-engineering/config.local.example.yaml and ce-setup/references/config-template.yaml.
Update the inline routing description for the "Dispatch to external agents" menu option in ce-plan SKILL.md and plan-handoff.md to reflect single-unit shape; drop fan-out / parallel-execution language. Menu label and position (option 4 of 5) are unchanged; platform skill-invocation primitive guidance is preserved.
Light revisions to conductor-notes.md metadata-comment description so it lists the single-unit-MVP keys and states dependency-graph state is intentionally absent.

Regression guards preserved

Verbatim from EveryInc#762:

gh pr list invocations in Phase 4 require --state all (otherwise merged PRs are invisible to the respond loop).
git symbolic-ref invocations for origin/HEAD require --short (otherwise the full ref path leaks into dispatch metadata).
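The two guards can be sketched as argument vectors plus a check on the `git symbolic-ref` output. The flags shown (`--state all`, `--head`, `--json` for `gh pr list`; `--short` for `git symbolic-ref`) are real, but the specific filter and JSON fields, the helper names, and the step that strips the `origin/` prefix are assumptions for illustration rather than the skill's actual commands.

```typescript
// Argument vector for `gh pr list`. Without `--state all`, gh lists only open
// PRs, so a merged PR would be invisible to the respond loop.
// The --head filter and the JSON field list are illustrative assumptions.
function prListArgs(branch: string): string[] {
  return [
    "pr", "list",
    "--state", "all",
    "--head", branch,
    "--json", "number,state,url",
  ];
}

// Argument vector for `git symbolic-ref`. With --short git prints a value like
// "origin/main"; without it, the full "refs/remotes/origin/main".
const symbolicRefArgs = ["symbolic-ref", "--short", "refs/remotes/origin/HEAD"];

// Hypothetical post-processing: reject a leaked full ref path and strip the
// remote name to obtain the base branch.
function baseBranchFromSymbolicRef(output: string): string {
  const ref = output.trim();
  if (ref.startsWith("refs/")) {
    throw new Error(`full ref path leaked (was --short omitted?): ${ref}`);
  }
  return ref.replace(/^origin\//, "");
}

console.log(prListArgs("mvp/ce-dispatch-beta-rewrite").join(" "));
console.log(symbolicRefArgs.join(" "));
console.log(baseBranchFromSymbolicRef("origin/main\n"));
```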

Contract tests

Rewrote tests/skills/ce-dispatch-contract.test.ts end-to-end: dropped tests for removed behavior, updated tests for changed behavior, added tests for new behavior (orientation / agent-identity / comment-protocol sections; nine-step ce-plugin sequence; STOP-and-wait directive; comment-protocol prefix shape; ce-resolve-pr-feedback routing in Phase 4; ce-plan routing wording without fan-out language). 63/63 pass.

Validation

Targeted contract test: 63/63 pass.
bun test: 1307/1308 pass. The single failing test (resolve-base.sh > resolves against origin/HEAD in a detached shallow checkout) is a pre-existing environmental issue unrelated to this change — already failing on baseline main (verified Loop 1 baseline before any edits).
bun run release:validate: clean.

Review & Testing Checklist for Human

This is yellow risk: structural rewrite of an in-flight beta skill. The skill is disable-model-invocation: true, so accidental invocation by the planner is gated, but human-driven /ce-dispatch <plan> flows go through this code.

Skim the new plugins/compound-engineering/skills/ce-dispatch/SKILL.md top-to-bottom and confirm the four phases match your mental model of the sync MVP. Phase 0.3 (worktree confirmation), Phase 1 (single unit), Phase 4 (four-option respond menu) are the meaningful behavioral changes.
Skim the new plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md and confirm <orientation>, <agent-identity>, <comment-protocol> carry the shape you want, and that the nine-step <ce-plugin> block prescribes the compound-engineering loop the way you want the in-workspace agent to walk it.
Spot-check the contract test rewrite: tests/skills/ce-dispatch-contract.test.ts. Confirm the not.toMatch guards for removed behavior (Parallel Safety Check, six options, dependencies key) and the new expect(...).toContain assertions for added behavior reflect the right shape.
Confirm the ce-plan routing wording change reads correctly: plugins/compound-engineering/skills/ce-plan/SKILL.md (line 907) and plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md (line 67).

End-to-end test plan (manual, after merge)

Create a Conductor workspace from mvp/ce-dispatch-beta-rewrite for a small repo with a plan that has a single implementation unit. Note the worktree path.
From the orchestrator session: /ce-dispatch <plan-path>. Confirm Phase 0.3 asks for the worktree path; supply it.
Confirm Phase 1 surfaces the unit list and lets you pick one.
Confirm Phase 3 creates one GitHub issue with the rendered prompt as the body, containing the three new sections (<orientation>, <agent-identity>, <comment-protocol>) plus the legacy XML sections.
From the Conductor workspace: tell the agent to read the issue. Walk the nine-step loop end-to-end (/ce-work -> /ce-code-review -> /ce-compound -> /ce-commit-push-pr).
Back in the orchestrator: re-invoke /ce-dispatch to enter the respond loop. Confirm "Review the PR" pulls the new PR; confirm "Reply to agent comment" surfaces the latest agent comment if you posted one; confirm "Mark unit complete" closes the issue once the PR is merged.

Notes

This PR does not delete or rename any directories. The skill remains at ce-dispatch/. No additions to STALE_SKILL_DIRS, STALE_AGENT_NAMES, STALE_PROMPT_FILES, or EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN are needed.
The four architectural decisions you approved are all reflected: (1) single skill with Phase 4 respond menu; (2) dependencies: dropped from metadata footer; (3) branch mvp/ce-dispatch-beta-rewrite off main; (4) everything marked for removal in the plan was removed.
Drafted intentionally — pending your review before flipping to ready.
The pre-existing resolve-base.sh failure is environment-related (detached shallow checkout). I baselined it before any edits; not introduced by this PR.

Link to Devin session: https://app.devin.ai/sessions/56b17768c2ef4657ba155e2435bf1548
Requested by: @shubness

Rewrite ce-dispatch from multi-unit fan-out to a single-unit sync MVP. Each invocation creates one GitHub issue for one implementation unit; the orchestrator and the in-workspace agent coordinate via issue comments and the resulting PR; the user pings each side manually.

Behavior changes:
- Phase 0: drop dispatch_mode and dispatch_auto_review config keys; keep dispatch_branch_prefix, dispatch_base_branch, dispatch_labels. Add a Phase 0.3 step that asks the user for the worktree absolute path (the user creates the Conductor workspace before invoking the skill, which resolves the chicken-and-egg of needing a worktree before the issue exists). The worktree dirname becomes the agent name used in comments.
- Phase 1: pick exactly one implementation unit. Drop dependency-graph construction, parallel-safety overlap checks, and unit coalescing.
- Phase 2: render a single self-contained dispatch prompt. Populate three new template sections (<orientation>, <agent-identity>, <comment-protocol>) alongside the existing ones. Drop multi-unit coalescing branches.
- Phase 3: create exactly one issue via gh.
- Phase 4: collapse the six-option monitor loop to a four-option respond menu (Reply to agent comment, Review the PR, Mark unit complete, Done for now). Drop dependency-aware merge gating, the dependency-graph rendering, the auto re-dispatch path, and the auto-review path. Compose existing CE skills (ce-code-review, ce-resolve-pr-feedback) for the PR-feedback round-trip.

Prompt template:
- Add <orientation> for progressive context exposure (path list, not inlined content).
- Add <agent-identity> carrying agent-name and worktree-path.
- Add <comment-protocol> with the [<agent-name> -> orchestrator] timestamped format and an explicit STOP-after-asking directive that closes the airgap concern (the in-workspace agent must not make architectural decisions without orchestrator input).
- Rework <ce-plugin> as an explicit nine-step compound-engineering loop: read orientation -> /ce-work -> implement and verify -> /ce-code-review -> /ce-compound (optional) -> /ce-commit-push-pr -> comment with PR URL -> stop and wait -> on ping run /ce-resolve-pr-feedback. Loop.
- Drop dependencies: from the metadata footer; rename unit_ids: -> unit_id:; add agent_name: and worktree_path:.

Configs:
- Mirror the dropped/retained keys across .compound-engineering/config.local.example.yaml and ce-setup/references/config-template.yaml.
- Update the dispatch section header to describe single-unit sync MVP.

ce-plan routing:
- Update the inline routing description for the 'Dispatch to external agents' menu option in ce-plan SKILL.md and plan-handoff.md to reflect single-unit shape; drop fan-out / parallel-execution language. Menu label and position (option 4 of 5) are unchanged, and the platform skill-invocation primitive guidance is preserved.

Conductor notes:
- Light revisions to the metadata-comment description so it lists the single-unit-MVP keys and states that dependency-graph state is intentionally absent from the metadata.

Contract tests (rewritten):
- Drop tests for removed behavior (Parallel Safety Check, dependency graph in Phase 1; dependency check in Phase 3; six-option monitor loop / dependency-aware merge / show-dependency-graph in Phase 4; multi-unit unit_ids: / dependencies: in metadata footer).
- Update tests for changed behavior (frontmatter shape with [BETA] triplet and 'single' in description, Phase 0 worktree confirmation, Phase 0 retained-vs-dropped config keys with anchored table-row regex to allow callouts mentioning the dropped names in prose, Phase 4 four-option respond menu, single-unit metadata footer).
- Add tests for new behavior (<orientation>, <agent-identity>, <comment-protocol> sections; nine-step ce-plugin sequence; STOP-and-wait directive; comment-protocol prefix shape; ce-resolve-pr-feedback routing in Phase 4; ce-plan routing wording without fan-out language).

Regression guards preserved verbatim from PR EveryInc#762: gh pr list invocations in Phase 4 require --state all (otherwise merged PRs are invisible to the respond loop); git symbolic-ref invocations for origin/HEAD require --short (otherwise the full ref path leaks into dispatch metadata).

Validation:
- Targeted contract test: 63/63 pass.
- Full bun test: 1307/1308 pass; the single failing test (resolve-base.sh > resolves against origin/HEAD in a detached shallow checkout) is a pre-existing environmental issue unrelated to this change and was already failing on baseline main.
- bun run release:validate: clean.

Refs #1, #2, #3 (the prior beta of this skill); supersedes the multi-unit dispatch model in this fork's MVP track.
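The "anchored table-row regex" mentioned in the commit message can be illustrated as follows. This is a sketch under assumptions: the Markdown table layout, the sample text, and the helper name `tableRowFor` are hypothetical, and the key names are assumed to contain no regex metacharacters. The point is that a pattern anchored to the start of a table row will not fire on a prose callout that merely mentions a dropped key.

```typescript
// Hypothetical guard: match a config key only when it appears as the first
// cell of a Markdown table row, e.g. "| `dispatch_mode` | ... |".
// Assumes `key` contains no regex metacharacters.
function tableRowFor(key: string): RegExp {
  // The "m" flag makes ^ match at the start of every line.
  return new RegExp(`^\\|\\s*\`${key}\`\\s*\\|`, "m");
}

// Made-up excerpt: a callout naming the dropped keys, then the retained keys.
const sample = [
  "> Note: `dispatch_mode` and `dispatch_auto_review` were removed.",
  "",
  "| Key | Purpose |",
  "| --- | --- |",
  "| `dispatch_branch_prefix` | Prefix for dispatch branches |",
  "| `dispatch_base_branch` | Base branch for PRs |",
  "| `dispatch_labels` | Labels applied to the issue |",
].join("\n");

// Dropped key mentioned only in prose: no table row, so no match.
console.log(tableRowFor("dispatch_mode").test(sample));
// Retained key present as a table row: match.
console.log(tableRowFor("dispatch_branch_prefix").test(sample));
```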

devin-ai-integration · 2026-05-04T21:28:55Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

devin-ai-integration

Devin Review found 1 potential issue.

View 4 additional findings in Devin Review.

devin-ai-integration · 2026-05-04T21:39:31Z

 - **Create Issue** — Detect the project tracker (`gh` for GitHub, `linear` for Linear) and create the issue from the plan file as described under "Issue Creation" in `references/plan-handoff.md`. After creation, display the issue URL and ask whether to proceed to `/ce-work` via the platform's blocking question tool.
 - **Open in Proof (web app) — review and comment to iterate with the agent** — Load the `ce-proof` skill in HITL-review mode with the plan file as `source file`, the plan title as `doc title`, identity `ai:compound-engineering` / `Compound Engineering`, and recommended next step `/ce-work`. Then follow the post-HITL resync logic in `references/plan-handoff.md`, which handles the four `ce-proof` return statuses, re-runs `ce-doc-review` after material edits, and falls back gracefully on upload failure.
- **Dispatch to external agents** — Invoke the `ce-dispatch` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-dispatch` — fire the invocation now so the plan's implementation units fan out to GitHub issues in this session.
+- **Dispatch to external agents** — Invoke the `ce-dispatch` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-dispatch` — fire the invocation now. The skill hands one implementation unit at a time off to a Conductor (or other issue-driven) workspace via a single GitHub issue; the orchestrator and agent coordinate sync via issue comments and the resulting PR.


🟡 Menu label still describes multi-unit fan-out but routing text was updated to single-unit

The PR updated the routing text for the "Dispatch to external agents" option to single-unit language (line 907: "hands one implementation unit at a time off...via a single GitHub issue"), but the corresponding menu label at plugins/compound-engineering/skills/ce-plan/SKILL.md:899 still says "Create GitHub issues for each implementation unit" (plural, implying fan-out of all units). The same stale label exists in plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md:44. This inconsistency means the ce-plan agent presents users with a menu option whose description promises multi-unit dispatch while the actual behavior (routed through ce-dispatch) only creates one issue for one unit. The new test at tests/skills/ce-dispatch-contract.test.ts:528-546 only checks the routing bullet for fan-out language, not the menu label, so this drift isn't caught.

Prompt for agents

The menu option label at line 899 of ce-plan/SKILL.md says "Create GitHub issues for each implementation unit, ready for pickup by Conductor workspaces or other issue-driven agent workflows" which implies multi-unit fan-out. The same stale label exists at line 44 of plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md. Both need updating to reflect single-unit MVP behavior (e.g., "Dispatch one implementation unit to a Conductor workspace via a single GitHub issue" or similar). The corresponding test at tests/skills/ce-dispatch-contract.test.ts:528-546 should also be extended to check the menu label text (match pattern like /4\.\s+\*\*Dispatch to external agents\*\*[^\n]+/) for plural/fan-out wording.
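One way to picture the suggested test extension: reuse the menu-label pattern quoted above and scan the matched line for plural or fan-out wording. The wording pattern `FAN_OUT`, the helper name, and the separator between label and description in the sample lines are assumptions; a real contract test would read the actual SKILL.md text.

```typescript
// Menu-label pattern taken from the review comment above.
const MENU_LABEL = /4\.\s+\*\*Dispatch to external agents\*\*[^\n]+/;

// Hypothetical wording check for plural / fan-out phrasing.
const FAN_OUT = /\bfan[- ]out\b|\bissues\b|\beach implementation unit\b/i;

// Returns true when the menu label line still promises multi-unit dispatch.
function menuLabelHasFanOutWording(skillText: string): boolean {
  const m = MENU_LABEL.exec(skillText);
  if (m === null) {
    throw new Error("menu label for option 4 not found");
  }
  return FAN_OUT.test(m[0]);
}

// Sample lines: the description text is quoted from the review comment; the
// ": " separator is assumed.
const staleLabel =
  "4. **Dispatch to external agents**: Create GitHub issues for each implementation unit, ready for pickup by Conductor workspaces or other issue-driven agent workflows";
const updatedLabel =
  "4. **Dispatch to external agents**: Dispatch one implementation unit to a Conductor workspace via a single GitHub issue";

console.log(menuLabelHasFanOutWording(staleLabel));
console.log(menuLabelHasFanOutWording(updatedLabel));
```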


devin-ai-integration · 2026-05-05T10:10:57Z

Battle-tested this rewrite via Anthropic's skill-creator eval protocol before flipping out of draft. Evidence on the stacked PR #5 (mvp/ce-dispatch-evals → this branch).

Headline (Opus 4.7, ~$1.50 in API costs):

| Configuration | Pass rate (24 expectations across 4 evals) |
| --- | --- |
| `with_skill` | 100% (24/24) after iteration-2 assertion fix; 95% (23/24) raw iteration-1 |
| `without_skill` baseline | 51% (12/24) iter-1 / 56% (13/24) iter-2 |
| delta | +44 to +49 percentage points |

4 evals run, each with quantitative expectations (5–9 each):

1-happy-path-single-unit-dispatch — Phase 0–3 issue rendering (orientation, agent-identity, comment-protocol, ce-plugin block, metadata footer, single gh issue create): 9/9 vs. 4/9 baseline
2-phase-4-respond-review-pr — four-option menu + PR-review routing through /ce-code-review: 4/5 → 5/5 iter-2 vs. 3/5 → 4/5 baseline
3-phase-4-respond-reply-to-agent-comment — [orchestrator -> agent] <ISO 8601> reply prefix: 5/5 vs. 3/5 baseline
4-phase-4-respond-mark-unit-complete — gh issue close + worktree archival prompt + PR-merged verification gate: 5/5 vs. 2/5 baseline
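The headline percentages do not equal the pooled fractions (12/24 is 50%, not 51%). They appear consistent with averaging the per-eval pass rates instead; that is an inference from the numbers above, not something the report states. A short sketch of both calculations, using the iteration-1 counts listed for each eval (type and function names are hypothetical):

```typescript
type EvalResult = { name: string; passed: number; total: number };

// Mean of per-eval pass rates (each eval weighted equally).
function macroPassRate(results: EvalResult[]): number {
  const sum = results.reduce((acc, r) => acc + r.passed / r.total, 0);
  return sum / results.length;
}

// Pooled pass rate (each expectation weighted equally).
function pooledPassRate(results: EvalResult[]): number {
  const passed = results.reduce((acc, r) => acc + r.passed, 0);
  const total = results.reduce((acc, r) => acc + r.total, 0);
  return passed / total;
}

// Iteration-1 counts as reported in the per-eval list above.
const withSkillIter1: EvalResult[] = [
  { name: "1-happy-path-single-unit-dispatch", passed: 9, total: 9 },
  { name: "2-phase-4-respond-review-pr", passed: 4, total: 5 },
  { name: "3-phase-4-respond-reply-to-agent-comment", passed: 5, total: 5 },
  { name: "4-phase-4-respond-mark-unit-complete", passed: 5, total: 5 },
];
const baselineIter1: EvalResult[] = [
  { name: "1-happy-path-single-unit-dispatch", passed: 4, total: 9 },
  { name: "2-phase-4-respond-review-pr", passed: 3, total: 5 },
  { name: "3-phase-4-respond-reply-to-agent-comment", passed: 3, total: 5 },
  { name: "4-phase-4-respond-mark-unit-complete", passed: 2, total: 5 },
];

const pct = (x: number): number => Math.round(x * 100);
console.log(pct(macroPassRate(withSkillIter1)), pct(macroPassRate(baselineIter1)));
console.log(pct(pooledPassRate(baselineIter1)));
```

Under this reading, the iteration-1 delta is 95 minus 51, which gives the +44 end of the reported range.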

Findings: No skill bugs. The skill correctly renders all six required prompt-template sections, scopes content from the right unit on a multi-unit plan, surfaces exactly four Phase 4 options (no six-option monitor menu, no dependency graph, no auto-review), routes review through /ce-code-review, uses --state all consistently, and verifies PR is MERGED before closing the issue. The one initial assertion failure was an over-prescriptive eval (caught by the grader's own eval_feedback); refining the assertion in iteration 2 gave a clean 100% pass.

See:

The description of PR #5 (test(ce-dispatch): add skill-creator-style eval pack + battle-test results) for the full breakdown and reproducibility commands
evals/ce-dispatch-workspace/REPORT.md for the human-readable battle-test report
evals/ce-dispatch-workspace/iteration-1/benchmark.md for the iteration-1 aggregate
Spot-check any iteration-1/<eval-name>/with_skill/grading.json for per-expectation evidence

Caveat: the runner is single-shot Chat Completions, so the agent describes commands rather than executing them. Real gh issue create/gh issue close/multi-turn comment-protocol roundtrips remain the manual end-to-end test plan. The contract test in this PR (63/63 passing) covers the loader/template-render path.

Ready for review whenever you are.

devin-ai-integration Bot assigned shubness May 4, 2026

devin-ai-integration Bot mentioned this pull request May 5, 2026

test(ce-dispatch): add skill-creator-style eval pack + battle-test results #5

Draft

2 tasks

devin-ai-integration[bot] wants to merge 1 commit into main from mvp/ce-dispatch-beta-rewrite
