feat(smp): track per-task CPU residency to avoid unnecessary cross-core TLB shootdowns #84
li041 wants to merge 1 commit into Starry-OS:main
Pull request overview
This PR adds per-task CPU residency tracking to optimize TLB (Translation Lookaside Buffer) shootdown operations. By tracking which CPUs a process has run on, the system can avoid sending unnecessary cross-core TLB invalidation IPIs to CPUs that haven't executed the process.
Key changes:
- Added `AxCpuMask` tracking to `ProcessData` to record CPU residency
- Implemented `on_enter`/`on_leave` hooks to update the CPU mask when tasks are scheduled on/off CPUs
- Added `on_cpu_mask()` method to query which CPUs a task has run on
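As a rough illustration of how these pieces could fit together, here is a minimal user-space sketch. It is not the PR's code: `CpuMask` is a hypothetical 64-bit stand-in for `AxCpuMask`, the CPU id is passed in explicitly instead of being read from `axhal`, `std::sync::RwLock` replaces the kernel lock, and `shootdown_targets` is an invented helper showing how a mask might narrow the set of IPI targets.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for `AxCpuMask`: one bit per CPU, at most 64 CPUs.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct CpuMask(u64);

impl CpuMask {
    fn set(&mut self, cpu: usize, on: bool) {
        if on {
            self.0 |= 1u64 << cpu;
        } else {
            self.0 &= !(1u64 << cpu);
        }
    }

    fn get(&self, cpu: usize) -> bool {
        (self.0 & (1u64 << cpu)) != 0
    }
}

/// Simplified process data holding only the residency mask.
struct ProcessData {
    on_cpu_mask: RwLock<CpuMask>,
}

impl ProcessData {
    fn new() -> Self {
        Self { on_cpu_mask: RwLock::new(CpuMask::default()) }
    }

    /// Called when a task of this process is scheduled onto `cpu`.
    fn on_enter(&self, cpu: usize) {
        self.on_cpu_mask.write().unwrap().set(cpu, true);
    }

    /// Called when a task of this process is scheduled off `cpu`.
    fn on_leave(&self, cpu: usize) {
        self.on_cpu_mask.write().unwrap().set(cpu, false);
    }

    /// Snapshot of the CPUs currently marked in the mask.
    fn on_cpu_mask(&self) -> CpuMask {
        *self.on_cpu_mask.read().unwrap()
    }
}

/// Hypothetical helper: remote CPUs that would need a TLB-invalidation IPI.
fn shootdown_targets(mask: CpuMask, this_cpu: usize, num_cpus: usize) -> Vec<usize> {
    (0..num_cpus)
        .filter(|&cpu| cpu != this_cpu && mask.get(cpu))
        .collect()
}
```

With a mask like this, the initiating CPU flushes locally and sends IPIs only to the CPUs returned by the helper, rather than broadcasting to every core.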
core/src/task.rs
Outdated
    let mut on_cpu_mask = self.proc_data.on_cpu_mask.write();
    on_cpu_mask.set(axhal::percpu::this_cpu_id(), false);
Race condition: Clearing the CPU bit in `on_leave` creates a race when multiple threads from the same process run concurrently. Since `ProcessData` is shared via `Arc` among all threads in a process, if Thread A from the process leaves CPU X while Thread B from the same process is still running (or about to run) on CPU X, the bit for CPU X will be incorrectly cleared. The comment on line 195 says "The CPUs on which the task has run", which suggests historical tracking, but the current implementation attempts to track "currently running" CPUs, which requires more sophisticated synchronization or reference counting per CPU.
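One possible shape for the per-CPU reference counting mentioned above is sketched below. This is an assumption-laden illustration, not code from the PR: `CpuResidency`, `MAX_CPUS`, and the explicit `cpu` argument are all hypothetical, and the mask is derived from the counters as a plain `u64`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical upper bound on the number of CPUs.
const MAX_CPUS: usize = 8;

/// Counts, per CPU, how many threads of one process are currently on it.
struct CpuResidency {
    running: [AtomicUsize; MAX_CPUS],
}

impl CpuResidency {
    fn new() -> Self {
        Self { running: std::array::from_fn(|_| AtomicUsize::new(0)) }
    }

    /// A thread of the process is scheduled onto `cpu`.
    fn on_enter(&self, cpu: usize) {
        self.running[cpu].fetch_add(1, Ordering::AcqRel);
    }

    /// A thread of the process is scheduled off `cpu`.
    fn on_leave(&self, cpu: usize) {
        let prev = self.running[cpu].fetch_sub(1, Ordering::AcqRel);
        debug_assert!(prev > 0, "on_leave without a matching on_enter");
    }

    /// Bitmask of CPUs with a non-zero count.
    fn mask(&self) -> u64 {
        self.running
            .iter()
            .enumerate()
            .filter(|(_, n)| n.load(Ordering::Acquire) > 0)
            .fold(0u64, |m, (cpu, _)| m | (1u64 << cpu))
    }
}
```

Because each CPU has its own counter, the order in which Thread A's `on_leave` and Thread B's `on_enter` run on the same CPU no longer matters: the bit for that CPU stays set until the last thread of the process has left it.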
    /// The CPUs on which the task has run.
    pub on_cpu_mask: RwLock<AxCpuMask>,
The documentation states "The CPUs on which the task has run", which implies historical tracking of all CPUs where the process has executed. However, the implementation in `on_leave` (lines 157-158) clears CPU bits when leaving, suggesting it tracks "currently running CPUs" instead. This discrepancy between documentation and implementation needs to be resolved. Clarify whether this should track historical CPU usage (never clear bits) or current CPU presence (requires reference counting per CPU to handle multiple threads).
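For comparison, the "historical" reading of the doc comment could look roughly like the sketch below, where bits are only ever set. The type name `HistoricalResidency`, the `u64` mask, and the explicit `cpu` argument are assumptions made for illustration; the PR itself uses `RwLock<AxCpuMask>`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Sticky record of every CPU a process has ever run on (up to 64 CPUs).
struct HistoricalResidency {
    ever_ran_on: AtomicU64,
}

impl HistoricalResidency {
    fn new() -> Self {
        Self { ever_ran_on: AtomicU64::new(0) }
    }

    /// Setting a bit is a single atomic OR, so concurrent threads cannot
    /// lose each other's updates.
    fn on_enter(&self, cpu: usize) {
        self.ever_ran_on.fetch_or(1u64 << cpu, Ordering::AcqRel);
    }

    /// Intentionally a no-op: leaving a CPU does not erase history.
    fn on_leave(&self, _cpu: usize) {}

    /// Bitmask of all CPUs recorded so far.
    fn mask(&self) -> u64 {
        self.ever_ran_on.load(Ordering::Acquire)
    }
}
```

This variant avoids the race entirely, at the cost of a mask that only grows, so it filters fewer shootdown targets over time than a correctly maintained "currently running" mask would.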
…re TLB shootdowns
This PR improves the SMP and IPI infrastructure and introduces per-task CPU residency
tracking to reduce unnecessary cross-core TLB shootdowns.
Background
Cross-core TLB flushes are expensive and should be avoided when possible.
Previously, the kernel lacked sufficient information to determine which CPUs
a task had actually run on, resulting in overly conservative TLB shootdowns.
In addition, the underlying IPI mechanism had several limitations that
prevented reliable cross-core synchronization.
What these PRs do
- … `send_ipi` where notifications could only target a single CPU
Result
Related work