# WSL2 triggers sustained 2000+ MB/s phantom StorPort read storms on NVMe, killing all WSL processes

Follow-up to #40284, which was closed before resolution. The issue reproduces regularly and is now backed by kernel-level ETW evidence.

## Summary

WSL2 intermittently triggers extreme disk I/O saturation where physical disk reads reach 1,500–2,300 MB/s while all user-mode process I/O counters combined show less than 1 MB/s. The unattributed fraction is consistently 99.95–100%. Disk queue depths reach 24 and disk time exceeds 3,600%. This kills all WSL processes, VS Code remote connections, and terminal sessions.

## Environment

- **Windows**: 11 25H2, Build 26200.8037
- **WSL version**: Latest (kernel 6.6.87.2-microsoft-standard-WSL2)
- **Distro**: Ubuntu (WSL2)
- **Hardware**: MSI laptop, NVMe WD PC SN560 SDDPNQE-1T00-1032 (firmware 74116000)
- **VHDX**: 45GB on D: drive (150GB free), compacted
- **Sleep states**: S0 Modern Standby only (no S3)

## .wslconfig

```ini
[wsl2]
memory=10GB
swap=4GB
processors=12
localhostForwarding=true
guiApplications=false

[experimental]
autoMemoryReclaim=gradual
```

## Reproduction

1. Start Windows fresh (restart)
2. Open WSL (Ubuntu)
3. Run 1-2 terminal sessions with CLI workloads (e.g. Node.js processes)
4. Within 5-30 minutes, disk I/O saturates
5. Task Manager shows `vmmem` consuming 100% disk
6. All WSL processes are killed
7. `wsl --shutdown` immediately stops the storm

The issue reproduces approximately 30-50% of the time, typically within the first 15 minutes of a WSL session.

## Evidence

### Process-level monitoring

A custom PowerShell logger built on `Get-Counter` captures physical disk metrics and per-process I/O rates every 10 seconds while disk time exceeds 90%. A typical storm snapshot:

```
08:32:31 - Disk at 1321.5% (Queue: 14, ReadMB/s: 2050.05, WriteMB/s: 0.72)
--- Attribution Summary ---
CapturedProcessMB/s: 1.03
UnattributedMB/s: 2050
UnattributedFraction: 0.9996
```

All visible processes (System, VS Code, Chrome, Everything, Slack, vmwp.exe) collectively account for ~1 MB/s of the 2,050 MB/s physical reads. The remaining 99.96% is invisible to process-level performance counters.
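For anyone who wants to reproduce the attribution gap from their own counter samples, here is a minimal Python sketch of the arithmetic. It is not the logger used above: the function name `attribute_reads`, the `Attribution` type, and the per-process split are all hypothetical (only the 1.03 MB/s total appears in the log), and the real logger's rounding and sampling may differ slightly.

```python
from dataclasses import dataclass


@dataclass
class Attribution:
    captured_mb_s: float          # sum of per-process read rates
    unattributed_mb_s: float      # physical reads no process accounts for
    unattributed_fraction: float  # unattributed / physical


def attribute_reads(physical_read_mb_s: float,
                    process_read_mb_s: dict[str, float]) -> Attribution:
    """Compare physical disk reads against the sum of per-process reads."""
    captured = sum(process_read_mb_s.values())
    # Clamp at zero: counters are sampled at slightly different instants,
    # so the process sum can briefly exceed the physical rate.
    unattributed = max(physical_read_mb_s - captured, 0.0)
    fraction = unattributed / physical_read_mb_s if physical_read_mb_s > 0 else 0.0
    return Attribution(captured, unattributed, fraction)


# Physical rate from the snapshot above; the per-process split is made up
# for illustration and only sums to the logged 1.03 MB/s.
result = attribute_reads(2050.05, {"System": 0.60, "Code.exe": 0.43})
print(f"{result.unattributed_mb_s:.2f} MB/s unattributed "
      f"({result.unattributed_fraction:.2%})")
```

A fraction this close to 1.0 is what separates these storms from ordinary heavy I/O, where the per-process counters would account for most of the physical reads.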

### ETW/WPR kernel trace

A 45-second WPR trace captured automatically during a storm (GeneralProfile.light + DiskIO + FileIO) shows:

- 5,111,169 events, 0 lost — trace is valid
- 14,537 DiskIo ReadInit, 18,202 DiskIo Read, 96,590 FileIo Read events
- 29,753 StorPort request dispatches, 29,797 completions, 60,208 queue operations
- 718,496 stack walks

This confirms real kernel-level storage stack activity, not a counter artifact.

### Storm timeline (typical)

```
06:34:57 -  249.1% (Queue: 0,  Read: 824 MB/s)   — starts
06:35:11 -  241.3% (Queue: 2,  Read: 1047 MB/s)   — escalating
06:35:37 -  323.0% (Queue: 1,  Read: 1403 MB/s)   — ramping
06:35:51 - 1187.4% (Queue: 14, Read: 2164 MB/s)   — peak
06:36:17 - 1476.9% (Queue: 11, Read: 2316 MB/s)   — sustained
06:37:23 - 3649.7% (Queue: 24, Read: 1984 MB/s)   — peak queue
06:38:03 -  898.6% (Queue: 13, Read: 1860 MB/s)   — sustained
```

Storms last 2-10 minutes. Reads are nearly 100% of the I/O (writes stay near 0).
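To compare storms across log files, the timeline lines can be reduced to a few summary numbers. The following is a rough Python sketch that assumes the log format shown in the timeline above; `LINE_RE`, `parse_timeline`, and `summarize` are hypothetical names, not part of the original logger, and the span only covers the logged samples rather than the full storm.

```python
import re
from datetime import datetime

# Matches lines such as: "06:35:51 - 1187.4% (Queue: 14, Read: 2164 MB/s)"
LINE_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}) -\s+(?P<busy>[\d.]+)% "
    r"\(Queue: (?P<queue>\d+),\s+Read: (?P<read>[\d.]+) MB/s\)"
)

TIMELINE = """\
06:34:57 -  249.1% (Queue: 0,  Read: 824 MB/s)
06:35:11 -  241.3% (Queue: 2,  Read: 1047 MB/s)
06:35:37 -  323.0% (Queue: 1,  Read: 1403 MB/s)
06:35:51 - 1187.4% (Queue: 14, Read: 2164 MB/s)
06:36:17 - 1476.9% (Queue: 11, Read: 2316 MB/s)
06:37:23 - 3649.7% (Queue: 24, Read: 1984 MB/s)
06:38:03 -  898.6% (Queue: 13, Read: 1860 MB/s)
"""


def parse_timeline(text):
    """Turn matching log lines into dicts; non-matching lines are skipped."""
    samples = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            samples.append({
                "time": datetime.strptime(m["time"], "%H:%M:%S"),
                "busy_pct": float(m["busy"]),
                "queue": int(m["queue"]),
                "read_mb_s": float(m["read"]),
            })
    return samples


def summarize(samples):
    """Span of the logged samples plus peak values (no midnight wraparound)."""
    span = samples[-1]["time"] - samples[0]["time"]
    return {
        "span_seconds": span.total_seconds(),
        "peak_read_mb_s": max(s["read_mb_s"] for s in samples),
        "peak_queue": max(s["queue"] for s in samples),
        "peak_busy_pct": max(s["busy_pct"] for s in samples),
    }


summary = summarize(parse_timeline(TIMELINE))
print(summary)
# {'span_seconds': 186.0, 'peak_read_mb_s': 2316.0, 'peak_queue': 24, 'peak_busy_pct': 3649.7}
```

Note that the peak read rate and the peak queue depth fall on different samples in this storm, so a per-sample view is still needed when correlating with the ETW trace.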

### File system filter drivers (clean)

`fltmc` output shows only standard Windows filters — no third-party drivers:

```
WdFilter (Windows Defender), bindflt, storqosflt, CldFlt, bfs, luafv, Wof, FileInfo
```

## What has been ruled out

1. **Third-party antivirus** — BitDefender fully uninstalled, Windows Defender exclusions added for `D:\wsl`, `\\wsl$`, `vmmem`, `vmwp.exe`
2. **Background downloaders** — Windows Update Delivery Optimization throttled, VS Code/VS auto-updates disabled
3. **Windows Update KB5083769** — uninstalled, storms persist
4. **WSLg** — disabled (`guiApplications=false`)
5. **WSL memory/swap config** — tested multiple configurations
6. **VHDX bloat** — compacted
7. **ext4 fragmentation inside VHDX** — defragmented all files (verified with `filefrag`), storms persist
8. **Disk health** — NVMe reports healthy, no read/write errors, Windows Event Logs clean
9. **Windows Search / Everything indexer** — present but showing 0 MB/s during storms

## Diagnostic logs available

I have the following ready to share via private link (not publicly, as the ETL contains file paths):

- **WSL diagnostic logs** collected using Microsoft's `collect-wsl-logs.ps1` with `-LogProfile storage` — in the exact format the WSL team expects
- **Multiple ETL traces** (600MB+) captured by WPR during active storms with DiskIO + FileIO + GeneralProfile.light profiles
- **Detailed disk I/O logs** with per-process attribution, PIDs, command lines, and physical-vs-process I/O gap analysis across multiple storms over 2 weeks

Happy to share any of these via a private channel, secure upload, or email with the investigating engineer.

## Conclusion

The storm is confirmed at the StorPort level in the host storage stack. Process-level counters cannot attribute the I/O because it originates below user mode, likely in the Hyper-V virtual disk driver path (vhdmp.sys / storvsp.sys). `wsl --shutdown` stops the storm immediately, which confirms that the WSL2 VM is the trigger. The NVMe drive (WD SN560) is a possible contributing factor, but this is unconfirmed.

WSL2 triggers sustained 2000+ MB/s phantom StorPort read storms on NVMe, killing all WSL processes #40420
