feat: copilot-cli provider (ACP) + nested git workspace support #293
Merged
Deploying agentv with
| Latest commit: | 100aa0c |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://abb5e87a.agentv.pages.dev |
| Branch Preview URL: | https://feat-workspace-lifecycle.agentv.pages.dev |
00b5edc to 4f57253
Replace setup/teardown with before_all/after_all/before_each/after_each lifecycle hooks (bun:test/Vitest naming). Shared workspace across tests in a suite with after_each reset. Remove workspaceFingerprint (YAGNI). Add cross-repo-sync showcase demonstrating the full workspace config surface with real ground truth diffs from agentevals commit history. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
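The hook ordering described in this commit can be sketched as follows. This is only an illustration: the `LifecycleHooks` interface and `runSuite` function are hypothetical names, not the PR's implementation, and the hooks are synchronous here for brevity, whereas real hooks that shell out would presumably be async.

```typescript
// Hypothetical hook shape; the keys mirror the before_all/after_all/
// before_each/after_each names from the commit message.
type Hook = () => void;

interface LifecycleHooks {
  before_all?: Hook;
  after_all?: Hook;
  before_each?: Hook;
  after_each?: Hook;
}

// Runs every test against one shared workspace.
function runSuite(hooks: LifecycleHooks, tests: Array<() => void>): void {
  hooks.before_all?.();
  try {
    for (const test of tests) {
      hooks.before_each?.();
      try {
        test();
      } finally {
        // Reset the shared workspace even when the test throws.
        hooks.after_each?.();
      }
    }
  } finally {
    // Tear down once, after the last test.
    hooks.after_all?.();
  }
}

// Record the call order for two tests.
const trace: string[] = [];
runSuite(
  {
    before_all: () => trace.push("before_all"),
    after_all: () => trace.push("after_all"),
    before_each: () => trace.push("before_each"),
    after_each: () => trace.push("after_each"),
  },
  [() => trace.push("test-1"), () => trace.push("test-2")],
);
```

Running `after_each` in a `finally` block is one way to keep a failing test from leaving the shared workspace dirty for the next one.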
4f57253 to 8286fcc
Add a new `copilot-cli` provider that spawns the Copilot CLI binary directly and communicates via the Agent Client Protocol (ACP), bypassing the @github/copilot-sdk's 60s session.idle timeout for long-running agents.

- New `CopilotCliProvider` using @agentclientprotocol/sdk
- `copilot-cli` is now its own ProviderKind (no longer aliases to copilot)
- CopilotCliResolvedConfig with executable, model, args, cwd, etc.
- Symbol-based log tracker matching copilot-sdk pattern

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
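The PR description says the provider talks to the CLI over stdio using NDJSON. As a rough illustration of that framing only, here is a minimal sketch that does not use `@agentclientprotocol/sdk` at all; `NdjsonDecoder` and `encodeMessage` are hypothetical names, and the real provider presumably delegates this work to the SDK.

```typescript
// Splits a stream of text chunks into newline-delimited JSON messages.
class NdjsonDecoder {
  private buffer = "";

  // Feed one chunk (e.g. as read from the child's stdout) and get back
  // every message that the chunk completed.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""); keep it for later.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  }
}

// Serializes one message as a single line, ready to write to stdin.
function encodeMessage(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// A message split across two chunks is only emitted once it is complete.
const decoder = new NdjsonDecoder();
const firstBatch = decoder.push('{"jsonrpc":"2.0","id":1,');
const secondBatch = decoder.push('"result":{}}\n{"jsonrpc":"2.0","id":2}\n');
```

Buffering the trailing partial line matters because a pipe can deliver a message in arbitrary pieces.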
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update file-changes.ts to use --submodule=diff for nested repo diffs
- Add stageNestedRepoChanges() helper to stage files in child repos
- Preserve .git in setup.ts so agents have full git history access
- Add copilot-cli provider resolution tests in targets.test.ts
- Update showcase targets.yaml to use copilot-cli provider
- Add nested git repo tests with clean env to avoid GIT_DIR leaks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
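Two ideas from this commit, the `--submodule=diff` flag and a clean environment for git calls, can be sketched as pure helpers. The function names and the exact list of stripped variables are assumptions for illustration; the commit only mentions `GIT_DIR`.

```typescript
type Env = Record<string, string | undefined>;

// Assumed list of variables that could redirect git to an outer repository.
const LEAKY_GIT_VARS = ["GIT_DIR", "GIT_WORK_TREE", "GIT_INDEX_FILE"];

// Returns a copy of the environment without the variables listed above,
// so a git call in a nested repo is not pointed at the parent repo.
function cleanGitEnv(env: Env): Env {
  const cleaned: Env = {};
  for (const [key, value] of Object.entries(env)) {
    if (!LEAKY_GIT_VARS.includes(key)) cleaned[key] = value;
  }
  return cleaned;
}

// Builds the argument list for a diff that expands submodule changes into
// file-level diffs instead of a one-line gitlink hash change.
function buildSubmoduleDiffArgs(baseRef?: string): string[] {
  return ["diff", "--submodule=diff", ...(baseRef ? [baseRef] : [])];
}

const cleanedEnv = cleanGitEnv({
  PATH: "/usr/bin",
  GIT_DIR: "/outer/.git",
  GIT_WORK_TREE: "/outer",
});
const diffArgs = buildSubmoduleDiffArgs();
```

In Node, the results could be passed to `execFileSync("git", diffArgs, { cwd, env: cleanedEnv })` from `node:child_process`, where `cwd` is the workspace root.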
… tracking

- Extract resolvePlatformCliPath, buildLogFilename, sanitizeForFilename, formatElapsed, killProcess, CopilotStreamLogger, isLogStreamingDisabled into copilot-utils.ts shared by both copilot-cli.ts and copilot-sdk.ts
- Fix timer leak in raceWithTimeout/sendWithTimeout: clear timeout in finally block so it doesn't fire after promise resolves (both providers)
- Fix cost accumulation in copilot-cli: usage_update cost is now accumulated instead of overwritten across multiple events
- Fix usage_update summarize event type (was 'usage', now 'usage_update')
- Log warning when shared workspace overrides workers count to 1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
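The timer-leak and cost-accumulation fixes follow a common pattern that can be sketched as below. This is not the PR's code: the signature of `raceWithTimeout`, the `UsageUpdate` shape, and its `cost` field are assumptions made for the example.

```typescript
// Rejects if `work` does not settle within `timeoutMs`. The timer is always
// cleared in `finally`, so it cannot fire after `work` has already resolved.
async function raceWithTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Hypothetical event shape: each usage_update carries an incremental cost.
interface UsageUpdate {
  type: "usage_update";
  cost: number;
}

// Sums the cost across events instead of keeping only the last value.
function accumulateCost(events: UsageUpdate[]): number {
  let total = 0;
  for (const event of events) {
    total += event.cost;
  }
  return total;
}

const totalCost = accumulateCost([
  { type: "usage_update", cost: 0.25 },
  { type: "usage_update", cost: 0.5 },
]);
```

Without the `clearTimeout`, a pending timer keeps the event loop alive until it fires, even after the work has finished.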
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The copilot CLI binary does not support --system-prompt. Move system prompt delivery to the ACP prompt messages array instead. Also add azure_judge target to showcase for LLM judge testing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
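One way to picture moving the system prompt into the prompt messages array is the sketch below. The `TextBlock` shape and `buildPromptBlocks` helper are assumptions for illustration; the source does not say how the provider orders or labels the system prompt within the array.

```typescript
// Assumed minimal shape of a text entry in the prompt messages array.
interface TextBlock {
  type: "text";
  text: string;
}

// Puts the system prompt (if any) ahead of the user prompt in the same
// array, instead of passing it as a CLI flag.
function buildPromptBlocks(userPrompt: string, systemPrompt?: string): TextBlock[] {
  const blocks: TextBlock[] = [];
  if (systemPrompt !== undefined && systemPrompt.trim() !== "") {
    blocks.push({ type: "text", text: systemPrompt });
  }
  blocks.push({ type: "text", text: userPrompt });
  return blocks;
}

const withSystem = buildPromptBlocks("Fix the failing test.", "You are a careful engineer.");
const withoutSystem = buildPromptBlocks("Fix the failing test.");
```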
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The rubric (checklist) and score-range prompt builders in both llm-judge.ts and llm-judge-prompt.ts were missing the file_changes section — only the freeform path included it. This meant rubrics evaluators could not see workspace diffs when grading agent output. Also adds examples/features/file-changes-judges/ — a minimal eval proving file_changes works with all three judge types: Azure LLM rubrics judge, built-in agent judge, and copilot-cli agent judge. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
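The fix amounts to every prompt builder emitting the same `[[ ## file_changes ## ]]` section. A minimal sketch of a shared helper, assuming a hypothetical `appendFileChangesSection` function and plain string prompts (the actual builders in `llm-judge.ts` may be structured differently):

```typescript
// Appends the file_changes section when a diff is available; otherwise the
// prompt is returned unchanged.
function appendFileChangesSection(prompt: string, fileChanges?: string): string {
  if (fileChanges === undefined || fileChanges.trim() === "") {
    return prompt;
  }
  return `${prompt}\n\n[[ ## file_changes ## ]]\n${fileChanges}`;
}

// Illustrative use from a rubric-style builder.
const rubricPrompt = appendFileChangesSection(
  "[[ ## rubrics ## ]]\n- Updates the README",
  "diff --git a/README.md b/README.md",
);
const promptWithoutDiff = appendFileChangesSection("[[ ## rubrics ## ]]\n- Updates the README");
```

Routing the rubric, score-range, and freeform paths through one helper is a design choice that keeps a section from going missing in only some of them.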
The evaluator is already type: agent_judge — having judge_target on a judge is semantically redundant. Rename to just target on the evaluator config surface. The target-level judge_target (in targets.yaml) is unchanged as it serves a different purpose there. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
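The test plan checks that no `judge_target` references remain on evaluators after the rename. A check of that kind might look like the sketch below; the `EvaluatorConfig` shape (including its `name` field) and `findStaleJudgeTargets` are hypothetical and not taken from `types.ts`.

```typescript
// Assumed evaluator config shape after parsing an eval YAML file.
interface EvaluatorConfig {
  name: string;
  type: string;
  target?: string;
  judge_target?: string; // old key, should no longer appear on evaluators
}

// Returns the names of agent_judge evaluators that still use the old key.
function findStaleJudgeTargets(evaluators: EvaluatorConfig[]): string[] {
  return evaluators
    .filter((e) => e.type === "agent_judge" && e.judge_target !== undefined)
    .map((e) => e.name);
}

const staleEvaluators = findStaleJudgeTargets([
  { name: "new-style", type: "agent_judge", target: "copilot-cli" },
  { name: "old-style", type: "agent_judge", judge_target: "copilot-cli" },
]);
```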
Summary
- New `copilot-cli` provider that spawns `copilot --acp --stdio` and communicates via `@agentclientprotocol/sdk` NDJSON. Bypasses the `@github/copilot-sdk` 60s `session.idle` timeout, enabling long-running agent tasks. Follows the vibe-kanban ACP pattern.
- `captureFileChanges()` now uses `git diff --submodule=diff` to expand nested repo changes into individual file diffs (instead of opaque gitlink hashes). Agents get full `git log`/`git blame` access in workspace templates.
- Extracted `copilot-utils.ts` with code shared between the `copilot-cli` and `copilot-sdk` providers (~96 line reduction). Fixed a timer resource leak (missing `clearTimeout` in `finally`) and a cost accumulation bug (overwrite → accumulate) in both providers.
- Moved system prompt delivery from the `--system-prompt` CLI flag (unsupported) to the ACP prompt messages array.
- Fixed `file_changes` missing from rubric/score-range LLM judge prompts: `buildRubricPrompt()` and `buildScoreRangePrompt()` in `llm-judge.ts` (and their counterparts in `llm-judge-prompt.ts`) were not including the `[[ ## file_changes ## ]]` section; only the freeform path did. Fixed all four functions.
- Renamed `agent_judge` evaluator `judge_target` → `target`: the evaluator is already `type: agent_judge`, so `judge_target` was semantically redundant. `judge_target` is now exclusively a target-level field (in targets.yaml), while evaluators use `target` to reference a configured provider.
- A warning is logged when `workers > 1` is overridden to 1 for shared workspaces.
- Added `examples/features/file-changes-judges/`: a minimal eval proving `file_changes` works with all three judge types (Azure LLM rubrics, built-in agent judge, copilot-cli agent judge), all scoring 1.0.

Key files

- `packages/core/src/evaluation/providers/copilot-cli.ts`
- `packages/core/src/evaluation/providers/copilot-utils.ts`
- `packages/core/src/evaluation/providers/copilot-sdk.ts`
- `packages/core/src/evaluation/workspace/file-changes.ts`: `--submodule=diff` + `stageNestedRepoChanges()`
- `packages/core/src/evaluation/evaluators/llm-judge.ts`: `file_changes` in rubric/score-range prompts
- `packages/core/src/evaluation/evaluators/llm-judge-prompt.ts`: `file_changes` in checklist/score-range assembly
- `packages/core/src/evaluation/types.ts`: `agent_judge` config: `judge_target` → `target`
- `packages/core/src/evaluation/orchestrator.ts`: `target` rename
- `examples/features/file-changes-judges/`
- `examples/showcase/cross-repo-sync/`

Test plan

- `cross-repo-sync` eval with copilot-cli target (score 0.75, 297K chars `file_changes`)
- `file-changes-judges` eval: Azure rubrics judge, built-in agent judge, and copilot-cli agent judge all score 1.0
- `[[ ## file_changes ## ]]` section present in all judge prompts (rubrics, freeform, agent_judge built-in, agent_judge delegated)
- No `judge_target` references remain in any eval YAML; the field appears only in targets.yaml, where it belongs

🤖 Generated with Claude Code