feat/batch-changes: support codingAgent steps via src-cli#1325
Open
taras-yemets wants to merge 11 commits into
Open
feat/batch-changes: support codingAgent steps via src-cli#1325taras-yemets wants to merge 11 commits into
taras-yemets wants to merge 11 commits into
Conversation
Vendors lib/batches/codingagent + lib/batches/codex from sourcegraph/ sourcegraph, adds the v3 codingAgent step + image alias to the spec + schema, and rewrites step.Run via codingagent.RenderRunCommand when step.CodingAgent is set. SRC_BATCHES_MODEL_PROVIDER_TOKEN and SRC_BATCHES_JOB_ID are forwarded from src's own env into the user container so the agent CLI can reach the server's model-provider proxy. Demo/MVP for v1 codingAgent support. Not for merge as-is.
…nt port Apply review feedback from sourcegraph/sourcegraph#12503 (where the upstream codingAgent step lives) to the src-cli v1 port. The same lib/batches/codingagent surface is shared, so the same review applies here. Changes: - lib/batches/codex: switch to upstream-recommended curl-based install (codex/claude code both recommend it), pin codex CLI version, drop dependency on the agent CLI being pre-baked in the run image. ImageRequirements is now [curl, tar] so misconfigured images fail fast at step start. - lib/batches/codingagent/types: add Agent.InstallScript and InstallDir so each agent owns its own pinned install at a Sourcegraph-controlled path, ignoring whatever the user image ships. - lib/batches: add Step.MarshalJSON to canonicalize v3 image: into container: on the wire. Without this the prep-side cache key (which marshals Step to JSON) includes image while src-cli's executor side does not, producing silent cache misses on every v3 spec. - lib/batches: reject codingAgent + run in the same step at parse time instead of silently picking one. - internal/batches/executor: extract forwardCodingAgentEnv with a docstring spelling out what's forwarded and why. - Tests for forwardCodingAgentEnv and the v3 image/codingAgent parse paths. Test Plan: - go test ./internal/batches/executor/ (root module) - cd lib && go test ./batches/...
Three real follow-ups on top of the initial v1 port, all surfaced while porting feedback from sourcegraph/sourcegraph#12503: - Stop emitting SRC_BATCHES_MODEL_PROVIDER_TOKEN in src-cli logs. The forwarded token is now redacted in StepStarted UI metadata (JSON-lines) and in the 'full command' debug log, but still passed verbatim to the docker -e flags so the agent CLI inside the container can use it. Server-side RedactedValues stays as a backstop. - Reject v3 codingAgent steps that omit image: at parse time, instead of failing later in the executor with an empty-image error. - Harden the codex install script: extract into a mktemp dir and mv into the install path so a failed/retried install can't leave a half-written binary behind, and assert that the installed binary's --version contains pinnedVersion (catches a stale binary surviving from a prior failed install). Also adds tests for the new redaction helpers, Step.MarshalJSON canonicalization, and the new codingAgent-requires-image validation.
Drop test godocs that just paraphrase the test name, an inline block that duplicates a helper's own doc, and a single-line note restating a well-known Go idiom.
Sourcegraph now owns the wire-path/upstream-path mapping per agent type; src-cli only needs to construct its base URL as <modelProviderURL>/<agent-type> and let the agent CLI append its protocol's wire path (codex CLI is hardcoded to POST <base>/responses since wire_api=responses is now the only supported value, so the explicit -c model_providers.sourcegraph.wire_api line is also dropped).
- Replace internal-type-leaking error message with user-facing wording
in run_steps.go (codingAgent ModelProviderURL check).
- Align temp-file error wording in writeRunScriptFile with the rest of
the file ('writing to temporary file').
- Trim redundant assertions in TestRedactSensitiveEnv and
TestRedactSensitiveArgs.
- Tighten TestParseBatchSpec_v3_codingAgentRequiresImage to assert only
the actual error path.
- Extract v3SpecWith helper to deduplicate YAML scaffold across
batch_spec_test.go cases.
Adds a one-line note above the env-var / header constants pointing at the sibling block in sourcegraph/sourcegraph, since the two repos hold the contract independently. Splits InstallDir into its own const since it's src-cli-local and not part of the cross-repo contract.
The Step.MarshalJSON canonicalization and v3 image→container parse-time mirroring are not required for server-side codingAgent execution: the server canonicalizes container on the wire, so src-cli receives container-only JSON and reads step.Container directly. These changes hardened src-cli's local handling of v3 specs (preview / validate / apply paths) but were already broken pre-PR; restoring that is out of scope here. Drop them to keep this PR focused on the codingAgent feature.
The previous assertion round-tripped the rendered shell script through shellquote.Split and compared the last token to the expected prompt. That tokenizer does not understand POSIX shell comments, so the apostrophe in the install script's "can't" comment was read as an unterminated single-quoted string and swallowed everything to EOF as one token. Production /bin/sh handles the comment correctly; only the test was broken. Assert on the shell-quoted suffix instead.
burmudar
reviewed
May 28, 2026
|
|
||
| const ( | ||
| model = "gpt-5.4" | ||
| pinnedVersion = "0.134.0" |
Contributor
There was a problem hiding this comment.
I think these should be injected via WorkspaceInput.json
|
|
||
| # Stage in a temp dir and mv into place so a failed retry can't leave a half-written binary behind. | ||
| _url="https://github.com/openai/codex/releases/download/rust-v${_version}/codex-${_triple}.tar.gz" | ||
| _tmp=$(mktemp -d "${TMPDIR:-/tmp}/sg-codex.XXXXXX") |
| _version=%s | ||
|
|
||
| case "$(uname -m)" in | ||
| x86_64) _triple=x86_64-unknown-linux-musl ;; |
Contributor
There was a problem hiding this comment.
since src is the current runner, we know the architecture so I think this entire script can be made a lot simpler and we can fail a lot sooner than to fail inside the bash script.
I think a lot of what this script is doing can actually be done natively by src using go.
|
|
||
| var b strings.Builder | ||
| for _, binary := range a.ImageRequirements() { | ||
| b.WriteString(failIfMissing(a.Type(), binary)) |
Contributor
There was a problem hiding this comment.
we can just use exec.LookPath
| codex.Agent{}, | ||
| } { | ||
| if _, exists := out[a.Type()]; exists { | ||
| panic("duplicate codingagent agent for " + a.Type()) |
Contributor
There was a problem hiding this comment.
this is a programming error if there are double? we should also not panic
| SkippedSteps map[int]struct{} `json:"skippedSteps"` | ||
| // ModelProviderURL is the resolved proxy base URL for coding-agent | ||
| // steps; only set when the spec contains at least one codingAgent step. | ||
| ModelProviderURL string `json:"modelProviderURL,omitempty"` |
Contributor
There was a problem hiding this comment.
I wonder if we should make WorkspaceInputv3 ? Not for here but this is intended for use with Batches v3
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
v1 server-side batch changes run via
src-cli. v3 specs can declarecodingAgentsteps, so src-cli needs to know how to run them.This PR adds that support. It pairs with sourcegraph/sourcegraph#12503, which adds the server side for v1 (v2 is rejected at admission).
Architecture
lib/batches/codingagentturns acodingAgentstep into a shell script (lib/batches/codexprovides codex's install + invocation). The executor calls this renderer instead of treatingstep.runas a template, and writes the result into the run container as-is.SRC_BATCHES_MODEL_PROVIDER_TOKENandSRC_BATCHES_JOB_IDon the v1CliStep. src-cli forwards them into the user container so the agent CLI can authenticate to/.executors/model-provider/batches. The token is redacted from src-cli's UI and log output; the server'sRedactedValuesis a backstop.lib/.src-cli/lib/is a soft-fork ofsourcegraph/sourcegraph'slib/batches, wired viareplace github.com/sourcegraph/sourcegraph/lib => ./libingo.mod. The newcodingagent/codexpackages and the changes tobatch_spec.go/batch_spec_stringdata.gomirror sourcegraph#12503. src-cli's copy is what runs in the user container, so the duplication is on purpose, not drift.batch_spec_stringdata.gowas regenerated from the canonical schema.SRC_BATCHES_MODEL_PROVIDER_TOKEN,SRC_BATCHES_JOB_ID, andX-Sourcegraph-Job-IDare defined in both this repo and sourcegraph#12503. Sharing them would mean either untangling thelib/fork or extracting a new shared module — both much more work than keeping three short strings in sync. Both sides have comments pointing at the mirror.Test Plan
Verified end-to-end together with sourcegraph#12503; see the test plan and demo there.