Skip to content

Fix JSON file corruption in save_payload_as_json when merge=True#97

Open
jasonwbarnett wants to merge 2 commits intobuildkite:mainfrom
altana-ai:fix/merge-json-write-outside-lock
Open

Fix JSON file corruption in save_payload_as_json when merge=True#97
jasonwbarnett wants to merge 2 commits intobuildkite:mainfrom
altana-ai:fix/merge-json-write-outside-lock

Conversation

@jasonwbarnett
Copy link
Contributor

Summary

  • Bug: In save_payload_as_json, when merge=True, the file read happens inside the FileLock but the file write happens outside it. This creates a race condition where multiple concurrent processes can read the file, then all try to write simultaneously, resulting in JSON corruption or lost data.
  • Impact: This affects any environment where multiple pytest processes merge results into the same JSON file concurrently — for example, when pants runs pytest in parallel across multiple targets and each process merges its results into a shared output file.
  • Fix: Move the file write inside the FileLock context manager when merge=True, ensuring the read-modify-write cycle is atomic. Non-merge writes don't need the lock and are handled in a separate else branch.

Before (buggy)

if merge:
    lock = FileLock(f"{path}.lock")
    with lock:
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                existing_data = json.load(f)
            data = existing_data + data

with open(path, "w", encoding="utf-8") as f:  # <-- OUTSIDE the lock!
    json.dump(data, f)

After (fixed)

if merge:
    lock = FileLock(f"{path}.lock")
    with lock:
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                existing_data = json.load(f)
            data = existing_data + data
        with open(path, "w", encoding="utf-8") as f:  # <-- INSIDE the lock
            json.dump(data, f)
else:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)

Test plan

  • All 5 existing save_json tests pass (test_save_json_payload_without_merge, test_save_json_payload_with_merge, test_save_json_payload_with_non_existent_file, test_save_json_payload_with_invalid_file, test_save_json_payload_with_large_data)
  • Verify in a concurrent environment (e.g., pants with parallel pytest targets) that the merged JSON output is no longer corrupted

Jason Barnett and others added 2 commits March 19, 2026 14:22
The file write was happening outside the FileLock when merge=True,
causing JSON corruption when multiple pytest processes (e.g., from
pants running tests in parallel) try to merge results into the same
file concurrently. Both the read and write must be inside the lock
to ensure atomicity.

Move the write inside the lock block when merge=True, and use an
else branch for non-merge writes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Spawns 10 threads that simultaneously call save_payload_as_json with
merge=True on the same file. Verifies the result is valid JSON
containing all 10 entries.

This test fails with the buggy code (write outside lock) and passes
with the fix (write inside lock), confirming the race condition is
resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant