Add atomic multi-file mode with write-ahead journal and recovery #19

Merged
smorin merged 3 commits into main from claude/atomic-groups-two-phase-commit-s3xqT on Mar 11, 2026

@smorin (Owner) commented Mar 11, 2026

Summary

This PR implements atomic multi-file operations for the toggle CLI, enabling all-or-nothing semantics across multiple file modifications. If the process is interrupted during the commit phase, a write-ahead journal allows recovery via rollback or forward completion.
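As a rough illustration of the all-or-nothing idea, the sketch below stages new content in sibling temp files and only then renames them over the originals. It is a minimal, std-only sketch under stated assumptions: `BatchSketch`, `StagedFile`, and the `.staged` suffix are hypothetical names chosen for this example, and the journal, backups, and lock described in this PR are deliberately left out.

```rust
use std::ffi::OsString;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// One staged change: the file to replace and the temp file holding its new content.
struct StagedFile {
    target: PathBuf,
    temp: PathBuf,
}

/// Hypothetical, simplified stand-in for a two-phase batch (not the PR's real type).
#[derive(Default)]
struct BatchSketch {
    staged: Vec<StagedFile>,
}

impl BatchSketch {
    /// Phase 1: write the new content to a sibling temp file and flush it to disk.
    /// The target itself is not touched yet.
    fn stage(&mut self, target: &Path, content: &[u8]) -> io::Result<()> {
        let mut name: OsString = target
            .file_name()
            .ok_or_else(|| io::Error::other("target has no file name"))?
            .to_os_string();
        name.push(".staged"); // assumed suffix, for illustration only
        let temp = target.with_file_name(name);

        let mut file = File::create(&temp)?;
        file.write_all(content)?;
        file.sync_all()?; // flush data and metadata before any rename happens

        self.staged.push(StagedFile { target: target.to_path_buf(), temp });
        Ok(())
    }

    /// Phase 2: rename every temp file over its target; returns how many were replaced.
    /// A failure part-way through leaves earlier targets already replaced, which is
    /// the gap a journal and backups are meant to cover.
    fn commit(self) -> io::Result<usize> {
        for s in &self.staged {
            fs::rename(&s.temp, &s.target)?;
        }
        Ok(self.staged.len())
    }

    /// Abandon the batch before commit: delete temp files, leave targets untouched.
    fn abort(self) {
        for s in &self.staged {
            let _ = fs::remove_file(&s.temp);
        }
    }
}
```

Keeping the temp file in the same directory as its target means the rename stays on one filesystem, which is what lets a single rename replace the file in one step on typical platforms.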

Key Changes

  • New journal module (src/journal.rs): Write-ahead journal system that records the state of atomic batch operations

    • Journal and JournalEntry types track staged writes, backups, and per-file completion status
    • Two-phase commit states: Staged (temp files written) and Committing (renames in progress)
    • Recovery functions: recover_staged() (cleanup), recover_rollback() (restore from backups), recover_forward() (complete interrupted commit)
    • SHA-256 integrity verification of staged content
    • Atomic journal persistence with fsync guarantees
  • New platform module (src/platform.rs): Platform-specific file operation helpers

    • durable_sync(): Uses F_FULLFSYNC on macOS, sync_data() elsewhere for guaranteed persistence
    • sync_dir(): Fsyncs directory metadata on Unix (no-op on Windows)
    • rename_with_retry(): Retry logic for Windows file locking issues (antivirus, indexer)
    • resolve_symlinks(): Canonical path resolution
  • Atomic batch operations in io module (src/io.rs):

    • AtomicBatch struct manages two-phase commit lifecycle
    • stage() method writes content to temp files with fsync and permission preservation
    • commit() method creates hard-link backups, persists the journal, then renames each staged temp file over its original (one atomic rename per file)
    • Lock file (LOCK_FILENAME) prevents concurrent atomic operations
    • Batch size warning for operations exceeding 500 files
  • CLI enhancements (src/cli.rs):

    • --atomic: Enable atomic multi-file mode
    • --no-backup: Disable backup creation (only valid with --atomic)
    • --recover: Recover from interrupted atomic operation
    • --recover-forward: Complete interrupted commit instead of rolling back
  • Main entry point updates (src/main.rs):

    • run_atomic() function: Computes all changes first, stages them, then commits atomically
    • compute_file_changes(): Dry-run version of file processing for pre-commit validation
    • Signal handler registration for graceful interrupt (SIGTERM, SIGINT)
    • Journal existence check prevents new operations if recovery is needed
    • CLI validation: --atomic incompatible with --dry-run, --no-backup requires --atomic, etc.
  • Integration tests (tests/integration.rs):

    • Happy path: multi-file atomic toggle with cleanup
    • No-op handling when no changes needed
    • Backup creation and cleanup verification
    • --no-backup warning
    • Flag combination validation
    • Section-based toggles in atomic mode
    • Recovery scenarios (staged journal, forward recovery)
    • Lock file blocking concurrent operations
    • Verbose output verification
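To make the recovery paths above easier to follow, here is a small sketch of how a journal state and the recovery flag could map to one of the three recovery functions. The enum and function names (`JournalStateSketch`, `RecoveryAction`, `choose_recovery`) are hypothetical, and treating a Staged journal as cleanup-only even when forward recovery is requested is an assumption of this sketch, not something the PR text confirms.

```rust
/// The two journal states named in this PR, mirrored in a hypothetical enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JournalStateSketch {
    /// Temp files are written; no original has been replaced yet.
    Staged,
    /// Renames were in progress when the process stopped.
    Committing,
}

/// Which recovery routine to run (names echo recover_staged, recover_rollback, recover_forward).
#[derive(Debug, PartialEq, Eq)]
enum RecoveryAction {
    CleanupStaged,
    Rollback,
    Forward,
}

/// Pick a recovery action from the journal state and whether forward recovery was requested.
fn choose_recovery(state: JournalStateSketch, forward: bool) -> RecoveryAction {
    match (state, forward) {
        // Assumption: nothing was renamed yet, so there is only staged data to clean up.
        (JournalStateSketch::Staged, _) => RecoveryAction::CleanupStaged,
        // Default recovery restores the originals from backups.
        (JournalStateSketch::Committing, false) => RecoveryAction::Rollback,
        // Forward recovery finishes the remaining renames instead.
        (JournalStateSketch::Committing, true) => RecoveryAction::Forward,
    }
}
```

With this shape, `choose_recovery(JournalStateSketch::Committing, true)` would correspond to running `--recover-forward` against a journal left in the Committing state.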

Notable Implementation Details

  • Hard-link backups: Created before rename phase; deleted on success or restored on rollback
  • Graceful interruption: Signal handlers set interrupted flag; checked between renames
  • Durability guarantees: Journal persisted with fsync before entering Committing state; per-rename progress tracked with best-effort updates
  • Rollback safety: Completed renames restored in reverse order; incomplete renames left as temp files for manual recovery if backups unavailable
  • Lock mechanism: File-based advisory lock prevents concurrent atomic operations in same directory
  • Fallback journal location: Uses first target file's parent directory if CWD is not writable
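The interplay of hard-link backups, the interrupted flag, and reverse-order rollback can be sketched as follows. This is a simplified, std-only illustration: `EntrySketch`, `commit_entries`, and `rollback_entries` are hypothetical names, the `renamed` field stands in for the journal's per-file completion status, and nothing here is persisted to disk the way the real journal is.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical per-file record; `renamed` plays the role of per-file completion status.
struct EntrySketch {
    target: PathBuf,
    temp: PathBuf,
    backup: PathBuf,
    renamed: bool,
}

/// Returns Ok(true) when every rename finished (backups are then deleted),
/// Ok(false) when the interrupted flag stopped the loop early (backups are kept).
fn commit_entries(entries: &mut [EntrySketch], interrupted: &AtomicBool) -> io::Result<bool> {
    // Create all backups before the first rename. A hard link keeps the old
    // content reachable after the target name is pointed at new content.
    for e in entries.iter() {
        fs::hard_link(&e.target, &e.backup)?;
    }
    for e in entries.iter_mut() {
        // The flag is checked between renames, never in the middle of one.
        if interrupted.load(Ordering::SeqCst) {
            return Ok(false);
        }
        fs::rename(&e.temp, &e.target)?;
        e.renamed = true;
    }
    // Success: the backups are no longer needed.
    for e in entries.iter() {
        fs::remove_file(&e.backup)?;
    }
    Ok(true)
}

/// Undo completed renames in reverse order; temp files of unfinished entries are left alone.
fn rollback_entries(entries: &mut [EntrySketch]) -> io::Result<()> {
    for e in entries.iter_mut().rev() {
        if e.renamed {
            // Put the old content back under the target name.
            fs::rename(&e.backup, &e.target)?;
            e.renamed = false;
        } else if e.backup.exists() {
            // Target was never replaced, so its backup link is redundant.
            fs::remove_file(&e.backup)?;
        }
    }
    Ok(())
}
```

In this sketch an interrupted or failed commit simply returns, leaving backups and completion flags in place so a later rollback can use them; that is meant to echo the journal-preserving behavior described above rather than reproduce it.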

https://claude.ai/code/session_01JTuqZjgo3pPPF5YUeGKarz

claude added 3 commits March 11, 2026 06:03
Implement --atomic flag for all-or-nothing multi-file operations using a
two-phase commit protocol with a JSON write-ahead journal. When enabled,
all file changes are first staged to temp files (phase 1), then atomically
renamed over originals (phase 2). If the process is interrupted, a
subsequent --recover run can roll back or --recover-forward to complete.

Key features:
- --atomic flag implies --backup via hard-links for safe rollback
- --no-backup opt-out with explicit warning about rollback limitations
- .toggle-atomic.journal WAL in CWD for crash recovery
- .toggle-atomic.lock advisory lock prevents concurrent --atomic runs
- Signal handling (SIGTERM/SIGINT) for graceful interrupt with journal preservation
- Platform helpers: macOS F_FULLFSYNC, Windows rename retry with backoff
- SHA-256 integrity verification for forward recovery
- into_temp_path() fd release pattern for large batch support (>500 files)
- CLI validation: --atomic rejects --dry-run, --no-backup requires --atomic
- 13 new integration tests covering happy path, recovery, validation, and edge cases

New files: src/journal.rs, src/platform.rs
New deps: sha2, signal-hook, fd-lock

https://claude.ai/code/session_01JTuqZjgo3pPPF5YUeGKarz

- Replace io::Error::new(io::ErrorKind::Other, ...) with io::Error::other()
- Remove identical if/else branches in stage() encoding check
- Prefix unused encoding parameter with underscore
- Apply cargo fmt formatting

https://claude.ai/code/session_01JTuqZjgo3pPPF5YUeGKarz

The platform.rs macOS cfg block uses libc::fcntl and libc::F_FULLFSYNC
but libc was missing from Cargo.toml, causing macOS CI builds to fail.

https://claude.ai/code/session_01JTuqZjgo3pPPF5YUeGKarz
@smorin merged commit 88b558d into main on Mar 11, 2026
13 checks passed
@smorin deleted the claude/atomic-groups-two-phase-commit-s3xqT branch on March 11, 2026 06:47