Avoid distro zombie state when wsl init dies in systemd mode #40433
chemwolf6922 wants to merge 7 commits into master
Co-authored-by: Copilot <copilot@github.com>
Pull request overview
This PR aims to prevent WSL distributions from getting stuck in a “zombie” state when wsl-init dies while systemd mode is enabled, by introducing a lightweight watcher process that monitors wsl-init and triggers a PID-namespace teardown when it exits.
Changes:
- Adds a new `wsl-init-watcher` entrypoint in the Linux init multicall binary and forks it when `BootInit` (systemd mode) is used.
- Implements watcher logic using `pidfd_open` + `poll()` to detect `wsl-init` exit and then calls `reboot(RB_POWER_OFF)` to tear down the PID namespace.
- Adds a Windows unit test that kills the presumed `wsl-init` PID in systemd mode and verifies subsequent WSL commands succeed.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| test/windows/UnitTests.cpp | Adds a unit test to validate the distro recovers after wsl-init is killed in systemd mode. |
| src/shared/inc/lxinitshared.h | Introduces the LX_INIT_WSL_INIT_WATCHER name constant for the new entrypoint. |
| src/linux/init/init.cpp | Adds wsl-init-watcher dispatch, forks the watcher in systemd mode, and implements watcher behavior via pidfd monitoring and namespace teardown. |
int WslInitWatcher(int Argc, char** Argv)
Do we need a separate entrypoint for this, or could we simply fork and run that logic inside the forked process?
This is mainly to get a clean state via exec and avoid inheriting state from the init process. Running the logic directly after fork would require carefully managing everything that is inherited, which may cause problems in the future.
Hi @OneBlue, could you please take another look? Thanks.
Summary of the Pull Request
This PR adds a wsl-init-watcher process to monitor the wsl init when systemd mode is enabled.
The mini_init process monitors PID 1 of a distro to determine whether the distro is alive. In non-systemd mode, PID 1 is the wsl init; in systemd mode, it is the systemd init. As a result, if the wsl init dies in systemd mode, the distro ends up in a zombie state in which all wsl calls from Windows fail with "Catastrophic failure".
This PR adds a new process that monitors the wsl init in systemd mode and requests a shutdown if the wsl init process dies.
PR Checklist
Detailed Description of the Pull Request / Additional comments
Validation Steps Performed
Add test: UnitTests::UnitTests::SystemdKillInitTerminatesDistro