Minimal multi-process support (fork/exec/waitpid/pipe) #816
Add single-host multi-process support for the Linux userland platform, enabling piped command execution (e.g., echo hello | cat) within a single host process. All forks use vfork semantics: the parent suspends while the child runs in the shared address space, and the child detaches to its own VA partition on exec.

Key changes:
- ProcessRegistry for process lifecycle (create, exit, waitpid)
- AddressSpaceProvider trait + VA partition allocator (128x1TiB)
- GlobalState/ProcessState split (per-process PageManager)
- do_fork with vfork semantics and VforkDone futex signaling
- Exec detach to new address space before loading new binary
- Fork-aware FD close (Arc refcount prevents cross-process destruction)
- FD cleanup on process exit for proper pipe EOF detection
- vfork syscall handler, Wait4 syscall dispatch
- PIE-only children (dynamic ELF load hint via pm.addr_min())

Tested with fork+exec+waitpid and pipe-between-two-children (echo|cat). No regressions in existing test suite.
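To make the 128x1TiB VA partition allocator above more concrete, here is a minimal sketch that assumes one bit per partition in a u128 bitmap. The names PartitionBitmap, alloc, and partition_offset are hypothetical, and offsets are computed relative to an assumed base of 0; the actual allocator in the PR may differ.

```rust
/// Partition count and size as stated in the commit message.
const PARTITIONS: u32 = 128;
const PARTITION_SIZE: u64 = 1 << 40; // 1 TiB

/// Hypothetical bitmap allocator: bit i set means partition i is in use.
#[derive(Default)]
pub struct PartitionBitmap {
    used: u128,
}

impl PartitionBitmap {
    /// Claim the lowest free partition, or return None when all 128 are taken.
    pub fn alloc(&mut self) -> Option<u32> {
        // Index of the lowest clear bit; equals 128 when every bit is set.
        let idx = (!self.used).trailing_zeros();
        if idx >= PARTITIONS {
            return None;
        }
        self.used |= 1u128 << idx;
        Some(idx)
    }
}

/// Start offset of a partition relative to an assumed base address of 0.
pub fn partition_offset(idx: u32) -> u64 {
    u64::from(idx) * PARTITION_SIZE
}
```

Returning None on exhaustion lets a caller turn VA exhaustion into an error instead of a panic.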
- Fork-aware FD refcounting: fork_refcount on IndividualEntry tracks cross-process FD sharing; clone_for_fork creates independent OwnedFds
- Cross-process signal mailbox: BTreeMap-based per-PID mailboxes for delivering signals (SIGCHLD, kill) between processes
- Drain mailbox on return to userspace (prepare_to_run_guest, check_for_interrupt), not just in waitpid
- SIGPIPE delivery on EPIPE for write, writev, sendto, sendmsg
- SignalState::clone_for_fork: deep-clone handlers, fresh shared_pending
- siginfo_chld: correctly decodes wait_status for CLD_EXITED vs CLD_KILLED
- Cross-process kill() routes to target's signal mailbox
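The siginfo_chld item relies on the standard Linux wait-status layout: the low 7 bits carry the terminating signal, bit 0x80 flags a core dump, and bits 8..16 carry the exit code. The following is a sketch of that decoding only; the function name decode_wait_status is hypothetical, and stop/continue statuses are deliberately left out.

```rust
/// si_code values for SIGCHLD as defined by Linux.
pub const CLD_EXITED: i32 = 1;
pub const CLD_KILLED: i32 = 2;
pub const CLD_DUMPED: i32 = 3;

/// Decode a wait status into (si_code, si_status).
/// Only normal exit and signal termination are handled; stopped or
/// continued statuses are out of scope for this sketch.
pub fn decode_wait_status(wait_status: i32) -> (i32, i32) {
    let termsig = wait_status & 0x7f;
    if termsig == 0 {
        // Normal exit: the exit code lives in bits 8..16.
        (CLD_EXITED, (wait_status >> 8) & 0xff)
    } else if wait_status & 0x80 != 0 {
        // Killed by a signal and dumped core.
        (CLD_DUMPED, termsig)
    } else {
        (CLD_KILLED, termsig)
    }
}
```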
Add raw_fds_matching_metadata() to RawDescriptorStorage to correctly resolve raw FD numbers (not slot indices) matching per-FD metadata. Called in sys_execve before detach_to_new_address_space.
ProcessRegistry::reparent() updates parent/child relationships and returns zombie status so caller can deliver SIGCHLD to new parent. Orphan handler in prepare_for_exit sends SIGCHLD to init for zombies.
- pgid/sid fields in ProcessContext, inherited by fork children
- setpgid with self-or-child constraint
- setsid checks process group leader (pgid == pid), not session leader
- kill(0, sig) sends to own process group
- kill(-pgid, sig) sends to specific process group
- pids_in_group() collects running processes in a group
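The kill() routing rules above can be sketched as a small resolver over a process table. The Proc struct and kill_targets function are hypothetical stand-ins; only the pid > 0, pid == 0, and pid < -1 cases come from the commit message, so pid == -1 is treated as unhandled here.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-process record with only the fields this sketch needs.
pub struct Proc {
    pub pgid: i32,
    pub running: bool,
}

/// Collect running processes in a group (the role pids_in_group() plays above).
pub fn pids_in_group(procs: &BTreeMap<i32, Proc>, pgid: i32) -> Vec<i32> {
    procs
        .iter()
        .filter(|(_, p)| p.running && p.pgid == pgid)
        .map(|(pid, _)| *pid)
        .collect()
}

/// Resolve the pid argument of kill() to concrete target PIDs.
/// Returns None for cases this sketch does not cover (pid == -1, unknown
/// caller or target, or a group id that does not fit in i32).
pub fn kill_targets(procs: &BTreeMap<i32, Proc>, caller: i32, pid: i32) -> Option<Vec<i32>> {
    match pid {
        p if p > 0 => procs.contains_key(&p).then(|| vec![p]),
        0 => Some(pids_in_group(procs, procs.get(&caller)?.pgid)),
        -1 => None,
        // pid < -1: unsigned_abs() avoids overflow when negating i32::MIN.
        p => Some(pids_in_group(procs, i32::try_from(p.unsigned_abs()).ok()?)),
    }
}
```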
resolve_path_lookup() extracts PATH from envp, tries each directory, falls back to /usr/bin:/bin if PATH not set.
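A minimal sketch of this lookup, assuming envp is available as a slice of KEY=VALUE strings and that file existence is checked through a caller-supplied probe. The signature, the exists parameter, and the pass-through for names containing a slash are assumptions for illustration, not the shim's actual API.

```rust
/// Fallback search path when envp carries no PATH entry (from the commit message).
const DEFAULT_PATH: &str = "/usr/bin:/bin";

/// Resolve a bare command name against PATH taken from envp.
/// `exists` is a hypothetical probe so the sketch stays independent of any
/// particular filesystem API.
pub fn resolve_path_lookup(
    name: &str,
    envp: &[&str],
    exists: impl Fn(&str) -> bool,
) -> Option<String> {
    // Assumption: names containing a slash are used as given, without a search.
    if name.contains('/') {
        return Some(name.to_string());
    }
    let path = envp
        .iter()
        .find_map(|kv| kv.strip_prefix("PATH="))
        .unwrap_or(DEFAULT_PATH);
    path.split(':')
        // An empty PATH component conventionally means the current directory.
        .map(|dir| if dir.is_empty() { "." } else { dir })
        .map(|dir| format!("{dir}/{name}"))
        .find(|candidate| exists(candidate))
}
```

Directories are tried in order, so the first match wins, which mirrors the usual execvp-style behavior.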
Returns ENOSYS with log_unsupported if do_fork is called from a multi-threaded process, preventing undefined behavior.
try_wait now handles pid==0 (own process group), pid<-1 (specific process group), in addition to existing pid>0 and pid==-1.
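The four pid cases map naturally onto an enum. This is a sketch with hypothetical names (WaitTarget, wait_target); it uses unsigned_abs() so that negating i32::MIN cannot overflow.

```rust
/// Which children a wait call should consider, derived from its pid argument.
#[derive(Debug, PartialEq, Eq)]
pub enum WaitTarget {
    /// pid == -1
    AnyChild,
    /// pid == 0
    OwnProcessGroup,
    /// pid > 0
    Pid(u32),
    /// pid < -1
    ProcessGroup(u32),
}

/// Classify the pid argument of waitpid/wait4.
pub fn wait_target(pid: i32) -> WaitTarget {
    match pid {
        -1 => WaitTarget::AnyChild,
        0 => WaitTarget::OwnProcessGroup,
        p if p > 0 => WaitTarget::Pid(p.unsigned_abs()),
        // unsigned_abs() is well-defined for i32::MIN, where `-p` would overflow.
        p => WaitTarget::ProcessGroup(p.unsigned_abs()),
    }
}
```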
readv: check pipe FD once before the iov loop, then use a dedicated pipe read path that avoids the double-mutable-borrow of kernel_buffer that would occur inside run_on_raw_fd closures. writev: route pipe FDs through write_to_iovec inside run_on_raw_fd.
Add three integration tests:
- test_kill_signal: vfork+exec sleeper, kill with SIGKILL, verify WIFSIGNALED
- test_waitpid_wnohang: poll with WNOHANG until child exits
- test_exec_path_lookup: execve bare name triggers PATH search in shim

Fix: move PATH resolution before shebang resolution in sys_execve. Previously, resolve_shebang tried to open the bare name (e.g. 'exit_with'), which failed with ENOENT before PATH lookup could run.
…ndling hardening
- Add process count limit (128) to prevent fork bombs
- Cap signal mailbox at 256 entries (drop oldest on overflow)
- Return ECHILD for process group waits with no matching children
- detach_to_new_address_space returns Result instead of panicking on VA exhaustion
- PID overflow returns CreateProcessError instead of panicking
- exit_process is idempotent (no assert on double-exit)
- Fork failure cleanup: remove zombie registry entry after spawn failure
- Store address_space_id on ProcessState for future VA partition reclamation
- Defer VA partition reclamation (128 partitions sufficient for minimal version)
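The mailbox cap with drop-oldest behavior can be illustrated with a bounded queue. This is a sketch only: SignalMailbox, post, and drain are hypothetical names, and signals are modeled as plain i32 numbers.

```rust
use std::collections::VecDeque;

/// Cap taken from the commit message.
const MAILBOX_CAP: usize = 256;

/// Hypothetical bounded mailbox of pending signal numbers.
#[derive(Default)]
pub struct SignalMailbox {
    queue: VecDeque<i32>,
}

impl SignalMailbox {
    /// Enqueue a signal. When the mailbox is full, the oldest entry is
    /// dropped to make room and returned to the caller.
    pub fn post(&mut self, sig: i32) -> Option<i32> {
        let dropped = if self.queue.len() >= MAILBOX_CAP {
            self.queue.pop_front()
        } else {
            None
        };
        self.queue.push_back(sig);
        dropped
    }

    /// Take all pending signals in arrival order.
    pub fn drain(&mut self) -> Vec<i32> {
        self.queue.drain(..).collect()
    }
}
```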
- Fix CRITICAL wait4 futex race: snapshot exit epoch BEFORE try_wait to prevent missed wakeups when child exits between check and block
- Fix PID/TID namespace collision: advance_next_pid after thread creation, saturating_add for child_pid+1 overflow
- Remove duplicate unused ProcessRegistry from LiteBox core
- Remove dead exit_epoch field from RegistryInner
- Fix siginfo_chld: use (wait_status & 0x7f)==0 instead of fragile trailing_zeros heuristic for normal exit detection
- Replace unwrap/expect with Result propagation in fork punchthrough
- Fix try_wait doc comment to match actual return type semantics
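The wait4 race fix follows a common pattern: read the epoch first, then check for a waitable child, then block only if the epoch is still the value that was read. The sketch below shows that ordering with a Mutex and Condvar standing in for the futex used in the PR; ExitEpoch, wait_past, and wait_for_child are hypothetical names.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical exit-epoch counter, bumped once for every child exit.
#[derive(Default)]
pub struct ExitEpoch {
    epoch: Mutex<u64>,
    changed: Condvar,
}

impl ExitEpoch {
    /// Read the current epoch.
    pub fn snapshot(&self) -> u64 {
        *self.epoch.lock().unwrap()
    }

    /// Called on the exit path after the child has become waitable.
    pub fn bump(&self) {
        *self.epoch.lock().unwrap() += 1;
        self.changed.notify_all();
    }

    /// Block only while the epoch still equals `seen`; returns at once
    /// if an exit already happened after the snapshot was taken.
    pub fn wait_past(&self, seen: u64) {
        let guard = self.epoch.lock().unwrap();
        let _guard = self.changed.wait_while(guard, |e| *e == seen).unwrap();
    }
}

/// Wait loop: snapshot first, then check, then block on the snapshot.
pub fn wait_for_child<T>(epoch: &ExitEpoch, mut try_wait: impl FnMut() -> Option<T>) -> T {
    loop {
        let seen = epoch.snapshot();
        if let Some(result) = try_wait() {
            return result;
        }
        epoch.wait_past(seen);
    }
}
```

If the snapshot were taken after try_wait instead, a child exiting between the check and the block would bump the epoch unnoticed and the waiter could sleep indefinitely.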
…k ordering
- Fix HIGH: signal vfork_done before returning error when detach_to_new_address_space fails during exec, preventing parent hang
- Fix MEDIUM: remove redundant close-on-exec pass in sys_execve (close_on_exec() already handles it; second pass could double-decrement fork_refcount)
- Fix MEDIUM: add debug_assert that reaped zombie has no children in try_wait (children should have been reparented during exit_process)
- Fix MEDIUM: use unsigned_abs() instead of (-t).cast_unsigned() in try_wait to avoid i32::MIN overflow
- Fix MEDIUM: clone Arc from signal_mailboxes map and drop outer lock before acquiring per-mailbox lock to prevent nested lock acquisition
- Fix LOW: simplify redundant match arm in try_wait (t > 0 case)
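The lock-ordering fix for signal_mailboxes can be sketched as follows: the map lock is held only long enough to clone the Arc, and the per-mailbox lock is taken afterwards. The SignalMailboxes type and its methods are hypothetical and model signals as i32 numbers.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

/// Hypothetical per-PID mailbox holding pending signal numbers.
type Mailbox = Arc<Mutex<Vec<i32>>>;

#[derive(Default)]
pub struct SignalMailboxes {
    map: Mutex<BTreeMap<i32, Mailbox>>,
}

impl SignalMailboxes {
    /// Create an empty mailbox for `pid` if none exists yet.
    pub fn register(&self, pid: i32) {
        self.map.lock().unwrap().entry(pid).or_default();
    }

    /// Deliver `sig` to `pid`; returns false if the PID has no mailbox.
    pub fn post(&self, pid: i32, sig: i32) -> bool {
        // The outer guard is a temporary that is dropped at the end of this
        // statement, so the two locks are never held at the same time.
        let mailbox = self.map.lock().unwrap().get(&pid).cloned();
        match mailbox {
            Some(mb) => {
                mb.lock().unwrap().push(sig);
                true
            }
            None => false,
        }
    }

    /// Take all pending signals for `pid` in arrival order.
    pub fn drain(&self, pid: i32) -> Vec<i32> {
        let mailbox = self.map.lock().unwrap().get(&pid).cloned();
        match mailbox {
            Some(mb) => {
                let mut pending = mb.lock().unwrap();
                std::mem::take(&mut *pending)
            }
            None => Vec::new(),
        }
    }
}
```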
Fix clippy lints across litebox core, shim, and runner crates:
- similar_names: allow on create_process and sys_setpgid
- question_mark: use ? operator in reparent
- match_wildcard_for_single_variants: explicit ProcessState::Running
- collapsible_if: use let-chains (edition 2024)
- unnecessary_map_or: use is_none_or
- verbose_bit_mask: use trailing_zeros
- items_after_statements: move const before let bindings
- unnecessary boolean not: invert if/else branches
- dead_code: allow on compile_static_pie (unused in loader test)
- Fix cargo fmt formatting in process.rs
- Replace unstable if-let match guard with nested match (E0658 on SNP/LVBS)
- Add stub AddressSpaceProvider impl for WindowsUserland platform
…, test PID mismatch
- Use full path for AddressSpaceKind in Windows platform impl
- Remove needless let bindings in net.rs and unix.rs (clippy::let_and_return)
- Fix test_syscall_rewriter: override PID to 1 to match process registry
- Fix cargo fmt formatting
… test PID
- exit_process returns None instead of panicking when process not found
- Override PID to 1 in Windows runner test helper (matches process registry)
- Fixes test_stdio, test_syscall_rewriter, and Windows loader test panics
…nd unused clone_table
- fork_refcount → process_refcount
- on_dup → on_ref_added
- ForkDecremented → SharedDecremented
- clone_for_fork → clone_for_child_selective(Option<&[usize]>); None = inherit all (bulk), Some = selective (NT-style)
- increment_fork_refcounts → increment_process_refcounts
- Remove unused Descriptors::clone_table
Single Descriptors::clone_storage_for_child method combines FD storage cloning with refcount bookkeeping — impossible to misuse by forgetting to increment refcounts after cloning.
Single method on Descriptors handles cloning + refcount increment in one pass. Removes the intermediate clone_for_child_selective from RawDescriptorStorage — no need for two methods when there is one caller.
The method naturally belongs on the object being cloned (the per-process FD table), not on Descriptors. Takes &mut Descriptors as a parameter for refcount bookkeeping.
…dSubsystemEntry
No subsystem overrides these hooks; they were all default no-ops. process_refcount on IndividualEntry handles cross-process sharing, and Arc::strong_count handles within-process dup sharing. The hooks added complexity without value; they can be re-added if a subsystem actually needs them.
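To illustrate the role of process_refcount, here is a rough sketch of a close path that only tears down the underlying object when the last process lets go. Entry, on_inherited_by_child, and the Destroyed variant are hypothetical; only process_refcount and SharedDecremented are names that appear in the commits above, and their real definitions may differ.

```rust
/// Result of closing a descriptor entry in one process.
#[derive(Debug, PartialEq, Eq)]
pub enum CloseOutcome {
    /// Another process still references the entry; keep the underlying object.
    SharedDecremented,
    /// Last reference gone; the underlying object may be torn down.
    /// (Hypothetical variant name.)
    Destroyed,
}

/// Hypothetical descriptor entry tracking how many processes hold it.
pub struct Entry {
    process_refcount: usize,
}

impl Entry {
    /// A freshly opened entry is held by exactly one process.
    pub fn new() -> Self {
        Self { process_refcount: 1 }
    }

    /// Called when a forked child inherits this entry.
    pub fn on_inherited_by_child(&mut self) {
        self.process_refcount += 1;
    }

    /// Called when one process closes the entry (explicitly or on exit).
    pub fn close(&mut self) -> CloseOutcome {
        self.process_refcount = self.process_refcount.saturating_sub(1);
        if self.process_refcount == 0 {
            CloseOutcome::Destroyed
        } else {
            CloseOutcome::SharedDecremented
        }
    }
}
```

For a pipe write end, this means readers see EOF only after every process holding the descriptor has closed it.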
Summary
Adds minimal multi-process support to litebox's Linux userland platform within a single host process. All forks use vfork semantics (parent suspended, child runs in shared address space until exec or exit), and child processes exec into their own VA partitions.
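The vfork parking described here can be modeled in miniature with threads: the "parent" blocks until the "child" signals that it has reached exec or exit. This is an analogy only, using a Mutex and Condvar rather than the futex in the PR; the VforkDone struct shown and the vfork_like helper are hypothetical.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical one-shot flag the child sets at exec or exit.
#[derive(Default)]
pub struct VforkDone {
    done: Mutex<bool>,
    cv: Condvar,
}

impl VforkDone {
    /// Child side: release the parked parent.
    pub fn signal(&self) {
        *self.done.lock().unwrap() = true;
        self.cv.notify_all();
    }

    /// Parent side: park until the child has signaled.
    pub fn wait(&self) {
        let guard = self.done.lock().unwrap();
        let _guard = self.cv.wait_while(guard, |done| !*done).unwrap();
    }
}

/// Run `child` on its own thread and park the caller until it signals.
/// The child must call signal() on every path, including error paths,
/// or the caller stays parked.
pub fn vfork_like<F>(child: F) -> thread::JoinHandle<()>
where
    F: FnOnce(&VforkDone) + Send + 'static,
{
    let done = Arc::new(VforkDone::default());
    let child_done = Arc::clone(&done);
    let handle = thread::spawn(move || child(&child_done));
    done.wait();
    handle
}
```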
Key features
- do_clone detects fork (!CLONE_THREAD), suspends parent via VforkDone futex, child shares parent's address space until exec
- On exec, child detaches to a new VA partition from VaPartitionAllocator (128 partitions), loads PIE binary there
- waitpid on any child (-1), own process group (0), and specific process group (-pgid)
- Fork-aware FD refcounting via fork_refcount on descriptor slots; pipe EOF detection works correctly across processes
- Cross-process kill() to pid/pgid
- setpgid, getpgid, getpgrp, setsid
- execve searches $PATH for bare binary names
Architecture
- litebox/: platform-generic ProcessRegistry (parent/child tracking, exit status, process groups), AddressSpaceProvider trait, fork-aware FD refcounting (fork_refcount on IndividualEntry)
- litebox_platform_linux_userland/: VaPartitionAllocator (128×1TiB partitions via bitmap)
- litebox_shim_linux/: POSIX syscall implementations, cross-process signal mailbox, vfork parking, exec detach
Known limitations (minimal version)
Testing
- test_fork_exec_wait: vfork + exec + waitpid, verify exit status
- test_pipe_fork: pipe between two exec'd children (echo hello | cat)
- test_kill_signal: cross-process kill + signal handling
- test_waitpid_wnohang: non-blocking wait with WNOHANG
Review
3 rounds of 6-agent static analysis review (correctness ×2, security ×2, code quality ×2). All CRITICAL and HIGH issues fixed. See /workspace/docs/litebox/impl/log.md for full review log.
References
- origin/wdcui/agent-sandbox-poc (single-host multi-process, entangled with other changes)