feat(apply): safety hardening — atomicity, locking, pnpm CoW, sidecars, Maven gate#80
feat(apply): safety hardening — atomicity, locking, pnpm CoW, sidecars, Maven gate#80Mikola Lysenko (mikolalysenko) wants to merge 4 commits into
Conversation
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
b148a97 to
15639fe
Compare
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
|
Update: pushed
Notable wires:
Local validation (all green): Shared helpers live in |
Adds five new modules to `socket-patch-core` and refactors `apply_file_patch` to compose them safely with #79's perm-preservation: - **`patch::apply_lock`** — cross-platform advisory file lock at `<.socket>/apply.lock` via `fs2`. Used by every mutating subcommand to serialize against concurrent socket-patch runs. - **`patch::cow`** — hardlink + symlink copy-on-write. Before patching, if `filepath` is a symlink into a content-addressed store (pnpm) or a regular file with `nlink > 1` (bazel mirrors, nix store overlays), give this project a private inode. The pnpm content store and every other project pointing at it stay byte-identical. - **`patch::sidecars`** — ecosystem-aware sidecar fixups dispatched from `apply_package_patch`. Cargo: rewrite `.cargo-checksum.json` with new SHA256s so `cargo build` accepts patched sources. NuGet: delete `.nupkg.metadata` (the documented "unknown" state vs. a stale `contentHash` that would flag tampering). PyPI / gem / Go: advisory-only — surface a one-line note about downstream tooling consequences. - **`crawlers::pkg_managers`** — path-based detector for the four Node.js layout flavors (npm / pnpm / yarn-classic / yarn-berry PnP). Apply uses this to refuse yarn-berry PnP (packages live in `.yarn/cache/*.zip`) and to surface a pnpm-detected note. - **`apply_file_patch` atomic rewrite** — two-phase commit: 1. Hash `patched_content` in memory; error out before any disk write if it doesn't match `expected_hash`. Removes the prior "wrote bytes, post-write verify failed, can't restore" window. 2. CoW the target if it's a shared inode. 3. Stage write to `<parent>/.socket-stage-<uuid>`, `sync_all()`, then `rename(stage, target)`. POSIX `rename(2)` is atomic — observers see either the old or new bytes, never a truncated half-write. Composes cleanly with #79's mode + uid/gid restore step which now operates on the post-rename inode. `ApplyResult` grows `sidecars_updated: Vec<String>` and `sidecar_advisory: Option<String>` so the CLI envelope can surface fixup outcomes. `fs2` and `tempfile` added to socket-patch-core dependencies. Two new tests pin the headline invariants: - `test_apply_file_patch_hash_mismatch_leaves_original_intact` — atomic-write contract: hash mismatch leaves target byte-identical AND no `.socket-stage-*` litter in parent dir. - `test_apply_file_patch_does_not_propagate_to_hardlinked_sibling` — the pnpm content-store invariant at the integration level. Plus 10 unit tests for cow + apply_lock and 13 for sidecars/* + 9 for pkg_managers. Assisted-by: Claude Code:claude-opus-4-7
Integrates the new socket-patch-core safety primitives into the CLI via the v3.0 unified `GlobalArgs` + `Envelope` patterns from #79. **`commands::lock_cli`** (new) — envelope-aware wrapper around `apply_lock::acquire`. Takes `Command` so the failure envelope's `command` field reflects which subcommand was blocked. On contention the binary emits `{status: "error", error: {code: "lock_held", ...}}` in JSON mode or a one-line stderr message otherwise, then exits 1. **Lock acquisition** added to `apply`, `rollback`, `repair`, `remove` immediately after the manifest existence check. `remove`'s outer lock spans the inner `rollback_patches` call (which deliberately does NOT acquire the lock so the composition doesn't self-deadlock). **Apply pkg-manager gating** — after the lock, `apply` runs `detect_npm_pkg_manager`: - `YarnBerryPnP` → emit `EnvelopeError("yarn_pnp_unsupported", ...)` pointing at `yarn patch` and exit 1. - `Pnpm` → surface a one-line stderr note. CoW handles the substantive safety work; this just tells the user the layout was understood. **Sidecar JSON via `event.details`** — `result_to_event` extends the Applied event with `details.sidecarsUpdated: string[]` and `details.sidecarAdvisory: string | null` when either is non-empty. Narrower JSON-envelope contract than first-class fields; consumers read `event.details.sidecarsUpdated` from JSON. **Maven + NuGet experimental runtime gates** in `ecosystem_dispatch.rs`. Even when compiled with `--features maven`/`nuget`, the crawlers refuse to dispatch unless the matching `SOCKET_EXPERIMENTAL_MAVEN=1`/`SOCKET_EXPERIMENTAL_NUGET=1` env var is set. Without it, surface a warning event and skip those PURLs. Reasoning: Maven patches corrupt jar sidecar checksums (sha1/md5); NuGet patches corrupt `.nupkg.sha512` signature sidecars that `dotnet restore` reads as tamper-evidence. `fs2` added to socket-patch-cli dev-dependencies for the lock e2e test (same crate the binary uses internally). Assisted-by: Claude Code:claude-opus-4-7
Adds four end-to-end integration test files exercising the safety
primitives through the binary, plus shared `tests/common/mod.rs`
helpers, plus two existing-test contract updates.
**Suites added (20 new tests):**
- `e2e_safety_lock.rs` (6 tests, non-ignored). Test holds the same
`.socket/apply.lock` the binary uses via `fs2` directly, then
spawns `socket-patch apply` and asserts the second process exits
with `error.code == "lock_held"`. Zero production-code hooks.
- `e2e_safety_yarn_pnp.rs` (5 tests, non-ignored). Yarn-berry PnP
markers (`.pnp.cjs`, `.pnp.loader.mjs`) trigger
`error.code == "yarn_pnp_unsupported"`. Negative control:
plain npm layout does NOT trigger the refusal.
- `e2e_safety_cargo_build.rs` (5 tests, `#[ignore]` + `--features
cargo`). Three synthetic-vendor tests:
1. Baseline `cargo check --offline --frozen` succeeds.
2. Negative control — mutating the source WITHOUT the sidecar
fixup makes cargo refuse with "checksum changed". Proves
cargo actually verifies, which is what makes the positive
test meaningful.
3. Sidecar fixup makes `cargo check` pass; `.cargo-checksum.json`
is rewritten and the `package` field is preserved.
4. JSON envelope contract: `.cargo-checksum.json` appears in
`event.details.sidecarsUpdated`.
Plus `traitobject_real_socket_patch_round_trip` — the cargo
layer-2+3 combined test: `cargo fetch traitobject@0.0.1` from
crates.io → `socket-patch get b15f2b7f-d5cb-43c9-b793-80f71682188f`
from patches-api.socket.dev → assert `.cargo-checksum.json`
rewritten + `cargo check` succeeds against the real, production
Socket patch.
- `e2e_safety_pnpm.rs` (4 tests, `#[ignore]`). Two projects share a
pnpm content store via `--config.package-import-method=hardlink`.
`socket-patch get` in project A patches A; project B + store
entry stay byte-identical. `pnpm install --frozen-lockfile` in B
afterwards does not revert A. Exercises CoW against a real pnpm
install rather than a hand-rolled hardlink.
**`tests/common/mod.rs`** — shared helpers (`binary`, `run`,
`assert_run_ok`, `git_sha256`, `sha256_hex`, `pnpm_run`, `cargo_run`,
`write_minimal_manifest`, `write_blob`, `parse_json_envelope`,
`envelope_error_code`, `envelope_error_message`) lifted from the
duplicated copies in `e2e_npm.rs` etc. Additive; existing suites
keep their inlined copies for now.
**CI matrix** in `.github/workflows/ci.yml`:
- `e2e_safety_cargo_build` on ubuntu + macos + windows
- `e2e_safety_pnpm` on ubuntu + macos + windows
(pnpm-on-Windows uses junctions + copies by default, so the CoW
invariant holds vacuously; the test still runs to verify apply
doesn't error on Windows. Semantic Windows nlink coverage is a
follow-up — `std::fs::Metadata` doesn't expose nlink on Windows
without `GetFileInformationByHandle` via `windows-sys`.)
- New `Setup pnpm` step (`npm install -g pnpm@10`) gated on the
pnpm suite. The fast non-ignored suites (`e2e_safety_lock`,
`e2e_safety_yarn_pnp`) run via the standard `test` job on all
three platforms.
**Existing-test contract updates** (these tests were pinning the
old, broken behavior; both still describe correct invariants —
their assertions just needed to track the rebased semantics):
- `tests/apply_invariants.rs`: `dir_hash` excludes `apply.lock`.
The lock file is deliberate ephemeral session state, not patch
content; the "apply is read-only against .socket/" invariant is
about manifest + blobs + diffs + packages.
- `tests/in_process_edge_cases.rs`:
`apply_blob_after_hash_mismatch_reports_failure` now asserts the
atomic-write contract — the target file is byte-identical to its
pre-call state on the hash-mismatch failure path, no half-written
corruption.
Assisted-by: Claude Code:claude-opus-4-7
e9187cc to
13cbfa7
Compare
|
Force-pushed — rebased onto upstream
Notable changes from the previous version of this PR:
Local validation (all green): Depscan PR SocketDev/depscan#20517 is now bumped to |
…+ advisory data
Replaces the previous `event.details.sidecarsUpdated` / `event.details.sidecarAdvisory`
free-form JSON bag with a typed, top-level `Envelope.sidecars[]` list.
## New types (`socket-patch-core/src/patch/sidecars/types.rs`)
pub struct SidecarRecord { purl, ecosystem, files, advisory }
pub struct SidecarFile { path, action: SidecarFileAction }
pub enum SidecarFileAction { Rewritten | Deleted | Created }
pub struct SidecarAdvisory { code, severity, message }
pub enum SidecarAdvisoryCode {
PypiRecordStale | GemBundleInstallReverts | GoModVerifyFails
| NugetSignedPackageTampered | SidecarFixupFailed
}
pub enum SidecarSeverity { Info | Warning | Error }
All derive `serde::Serialize`. Structs use camelCase; enums use
snake_case. Unit tests pin the JSON contract.
## JSON shape (consumer view)
```json
{
"command": "apply",
"events": [...],
"sidecars": [
{ "purl": "pkg:cargo/...", "ecosystem": "cargo",
"files": [{"path":".cargo-checksum.json","action":"rewritten"}] },
{ "purl": "pkg:nuget/...", "ecosystem": "nuget",
"files": [{"path":".nupkg.metadata","action":"deleted"}],
"advisory": { "code":"nuget_signed_package_tampered",
"severity":"warning", "message":"..." } }
]
}
```
- `sidecars` omitted from JSON when empty.
- `files` always present (possibly `[]` for advisory-only).
- `advisory` omitted when absent.
- `code` / `severity` are stable snake_case enum tags; `message`
is human text.
- `purl` joins to `events[].purl` for per-event context.
## Three real improvements over the old design
1. **No more lossy collapse.** NuGet's "deleted `.nupkg.metadata`
AND has a `.nupkg.sha512` signature" case now carries BOTH
a file entry AND an advisory. Before, the advisory was
silently lost when the file entry took its slot.
2. **Stable codes + severity.** Consumers (CI bots, dashboards,
telemetry, jq pipelines) can switch on `code` and route on
`severity` without regex-matching free-form strings.
3. **Decoupled from events.** Sidecar reporting is a top-level
`Envelope.sidecars` list. `PatchEvent.details` is no longer
mixed with `list` / `repair` / `remove`'s command-specific
bags — sidecar consumers have a typed schema all their own.
## Internal refactor
- `SidecarOutcome` removed. Per-ecosystem fixups return
`Result<Option<SidecarPayload>, SidecarError>` (internal
`SidecarPayload = { files, advisory }`); the dispatcher in
`sidecars/mod.rs` wraps the payload with PURL + ecosystem to
produce the `SidecarRecord`.
- `ApplyResult.sidecars_updated: Vec<String>` and
`sidecar_advisory: Option<String>` consolidated into a single
`sidecar: Option<SidecarRecord>` field.
- Apply CLI's `result_to_event` no longer attaches to
`event.details`; the run loop now calls
`env.record_sidecar(record.clone())` after each apply result.
- `Envelope` gains `sidecars: Vec<SidecarRecord>` field +
`record_sidecar` method.
- The error path (`SidecarError` returned by a fixup) is
converted at the apply boundary into a `SidecarRecord` with
`advisory.code = SidecarFixupFailed`, `severity = Error`.
Single uniform shape for consumers.
## Pre-existing test fixups
`in_process_remote_ecosystems_apply.rs` and `in_process_rollback_all_ecosystems.rs`
now set `SOCKET_EXPERIMENTAL_MAVEN=1` / `SOCKET_EXPERIMENTAL_NUGET=1`
when they explicitly exercise those paths. These were broken
silently by the Maven/NuGet runtime gates added in the prior
rebase (the gate was always there in commit 39a2321; tests just
happened not to exercise the maven/nuget paths to a depth where
the skip mattered).
## Test results
- cargo build --workspace --all-features: clean
- cargo build --release --workspace: clean (no warnings)
- cargo clippy --workspace --all-features -- -D warnings: clean
- cargo test --workspace --all-features: 1021 passed, 0 failed
- cargo test --features cargo --test e2e_safety_cargo_build --
--ignored: 5 passed (includes traitobject real-patch round trip)
The e2e cargo test `apply_reports_cargo_checksum_in_sidecars_updated`
tightened from a substring match to a structured-shape assertion
on `envelope.sidecars[].ecosystem=="cargo"` +
`files[].path=".cargo-checksum.json"` + `files[].action=="rewritten"`.
Assisted-by: Claude Code:claude-opus-4-7
|
Pushed What changed
Stable enum codes
Severity bucket: Sample consumer queries# All warning-level advisories across the run:
socket-patch apply --json | jq '.sidecars[] | select(.advisory.severity == "warning")'
# Packages that touched .cargo-checksum.json:
socket-patch apply --json | jq -r '.sidecars[] | select(.ecosystem == "cargo") | .purl'
# Count by advisory code (telemetry):
socket-patch apply --json | jq '[.sidecars[].advisory.code // "none"] | group_by(.) | map({code: .[0], count: length})'ValidationThe cargo e2e test Two pre-existing tests in Depscan PR SocketDev/depscan#20517 has been bumped to |
Summary
Hardens
socket-patch applyagainst the six failure classes surfaced by a recent audit. Each commit is self-contained, builds clean, and ships with tests.feat(apply): advisory file lock across mutating subcommands.socket/refactor(cli): plumb --api-url/--api-token without std::env::set_varstd::env::set_varunder tokio multi-thread runtimefeat(maven): gate Maven crawler behind SOCKET_EXPERIMENTAL_MAVEN=1<jar>.jar.sha1/md5); Maven patches today are experimentalfeat(apply): copy-on-write defense against pnpm content-store mutationnode_modules/<pkg>is a symlink/hardlink into the global store — patching project A used to mutate project B's installfeat(apply): detect pnpm + refuse on yarn-berry PnP.yarn/cache/*.zip; apply refuses with a clear pointer toyarn patchfeat(apply): atomic write via stage+renamefeat(apply): per-ecosystem sidecar fixups (cargo + nuget; advisories for pypi/gem/go)cargo build"checksum changed" failures; NuGet "tampered package" warnings; advisory events for ecosystems we cannot yet fully fixWhy this PR exists
socket-patch applyis run as the second step of the install → patch apply → build flow. Before this PR it had several silent failure modes that would either break the user's next build (Cargo checksum mismatch, NuGet content hash) or corrupt unrelated projects on the same machine (pnpm store mutation). The audit lays this out in detail; this PR closes the highest-blast-radius items.Confirmed in scope with project lead:
PatchFileInfoschema change is upstream of socket-patch).What stays out of scope
PatchFileInfo.yarn patch).contentHashrecompute (would need server-side support to ship the original.nupkg).go mod verifywould fail).RECORDrewriter (path-mapping between site-packages and the package dir has quirks I'd want to verify against real installs first — advisory only here).Tests
cargo build --workspace✓ (default features, what the release binary uses)cargo build --workspace --all-features✓cargo build --release --workspace✓ — no warningscargo clippy --workspace --all-features -- -D warnings✓cargo test --workspace --all-features✓ — 450 lib tests pass plus every integration suite (cli_parse_*,e2e_*).Notable new tests:
patch::cow::hardlink_is_broken_and_sibling_survives_mutation— the core pnpm invariant: patching one link target does not mutate the other.patch::apply::test_apply_file_patch_does_not_propagate_to_hardlinked_sibling— end-to-end variant proving the integration is wired.patch::apply::test_apply_file_patch_hash_mismatch_leaves_original_intact— atomic write contract: a failed apply leaves both the original file and the parent directory exactly as it was (no.socket-stage-*litter).patch::apply_lock::*— five tests covering acquire, contention, drop, missing dir, timeout.patch::sidecars::cargo::rewrites_only_patched_files+ sibling tests —.cargo-checksum.jsonround trip.patch::sidecars::nuget::*—.nupkg.metadatadeletion + signed-package advisory.crawlers::pkg_managers::*— 9 tests pinning the detection table.JSON envelope additions (additive)
The per-package apply result JSON now carries two new top-level keys:
sidecarsUpdated: string[]— paths relative to the package directory that were rewritten or deleted as part of the sidecar fixup. Empty when no sidecar applied.sidecarAdvisory: string | null— one-line operator advisory for ecosystems we cannot fully fix (PyPI RECORD, gem cache, go mod verify).nullwhen there is no advisory.Existing keys are unchanged. The contract test in
crates/socket-patch-cli/src/commands/apply.rswas updated to reflect the new top-level key set; downstream wrappers that strict-check the key set need a small update.Relationship to PR #79
This branch is off main at
b96a13fand is independent from PR #79 (feat/scan-apply-json, the v3.0 unified-args +scan --syncwork). They touch disjoint files and modules, so the merge should be conflict-free, but the depscan submodule bump (PR SocketDev/depscan#20517) will need a follow-up rebase once #79 lands so it picks up both code paths together.Manual smoke checks before merge
( socket-patch apply --json & ); socket-patch apply --json— second call emits{"errorCode":"lock_held"}.ln a b, run apply against a manifest covering one of them, verify the other is byte-unchanged.cargo buildsucceeds (no "checksum changed" error)..pnp.cjsproject, apply exits 1 witherrorCode: yarn_pnp_unsupported.--features mavenbutSOCKET_EXPERIMENTAL_MAVENunset, Maven PURLs surface the experimental warning and are skipped.Assisted-by: Claude Code:claude-opus-4-7