Fix executorch/extension/llm/custom_op/... #16665
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16665
Note: Links to docs will display an error until the docs builds have been completed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@mergennachin has exported this pull request. If you are a Meta employee, you can view the originating Diff in D90888544.
Pull request overview
This PR addresses test stability issues by adding threadpool configuration to prevent OMP errors and updating CUDA test requirements to specify minimum compute capability.
Changes:
- Added compute capability check for CUDA/Triton tests requiring CC >= 9.0
- Integrated a threadpool reset workaround in test_sdpa_with_kv_cache.py to prevent OMP errors (see the sketch after this list)
- Updated TARGETS dependency to include portable_lib for test_sdpa_with_kv_cache
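A minimal sketch of the threadpool workaround described above, assuming `_unsafe_reset_threadpool` is imported from the portable_lib pybindings as the TARGETS change suggests (the thread count passed below is an illustrative assumption, not necessarily the value the PR uses):

```python
from executorch.extension.pybindings.portable_lib import _unsafe_reset_threadpool

# Resetting the global threadpool at module import time, before the custom-op
# tests run, avoids the OMP error; 2 threads is an illustrative choice here.
_unsafe_reset_threadpool(2)
```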
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| extension/llm/custom_ops/test_update_cross_attn_cache.py | Added CUDA compute capability check (CC >= 9.0) for Triton-based tests |
| extension/llm/custom_ops/test_sdpa_with_kv_cache.py | Added _unsafe_reset_threadpool call to prevent OMP threading errors |
| extension/llm/custom_ops/TARGETS | Added portable_lib dependency for test_sdpa_with_kv_cache to support threadpool import |
Diff excerpt from extension/llm/custom_ops/TARGETS:

```python
    ],
    deps = [
        "//caffe2:torch",
        "//executorch/extension/pybindings:portable_lib",
```
Copilot AI commented on Jan 16, 2026:
The TARGETS file is missing a corresponding dependency update for test_update_cross_attn_cache. While test_update_cross_attn_cache.py doesn't currently use _unsafe_reset_threadpool, the file is importing from the same module family and may benefit from the same OMP error workaround that was applied to test_sdpa_with_kv_cache and test_quantized_sdpa. Consider adding the dependency if similar threading issues arise in the future.
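A hypothetical sketch of the target the comment refers to, written in Starlark (Buck's Python dialect); the rule name and attributes here are assumptions patterned on the hunk above, not the repository's actual file:

```python
# Hypothetical Buck target; python_unittest and its attributes are assumed.
python_unittest(
    name = "test_update_cross_attn_cache",
    srcs = ["test_update_cross_attn_cache.py"],
    deps = [
        "//caffe2:torch",
        # The dependency the comment suggests mirroring from test_sdpa_with_kv_cache:
        "//executorch/extension/pybindings:portable_lib",
    ],
)
```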
Force-pushed 0d5217c to eea4ed6.
@mergennachin has imported this pull request. If you are a Meta employee, you can view this in D90888544.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
Differential Revision: D90888544
Pulled By: mergennachin
Force-pushed cab29b1 to b7b7add.
Differential Revision: D90888544
Pulled By: mergennachin
Force-pushed b7b7add to 9be6185.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
Diff excerpt from extension/llm/custom_ops/test_update_cross_attn_cache.py:

```python
# Check if CUDA device has compatible compute capability for Triton kernels
# Minimum CC 9.0 (Hopper) required for current PyTorch/Triton build
CUDA_CC_COMPATIBLE = CUDA_AVAILABLE and torch.cuda.get_device_capability()[0] >= 9
```
Copilot AI commented on Jan 20, 2026:
The compute capability check may fail when CUDA is reported as available but no usable GPU device is present. Consider adding a device count check or wrapping the call in a try-except block; the condition should handle the case where torch.cuda.device_count() is 0.
Suggested change:

```python
# Before:
CUDA_CC_COMPATIBLE = CUDA_AVAILABLE and torch.cuda.get_device_capability()[0] >= 9

# After:
CUDA_CC_COMPATIBLE = (
    CUDA_AVAILABLE
    and torch.cuda.device_count() > 0
    and torch.cuda.get_device_capability()[0] >= 9
)
```
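For context, a minimal sketch of how such a flag typically gates Triton-based tests, incorporating the suggested device-count guard (the class and test names are illustrative assumptions, not the file's actual contents):

```python
import unittest

import torch

CUDA_AVAILABLE = torch.cuda.is_available()
CUDA_CC_COMPATIBLE = (
    CUDA_AVAILABLE
    and torch.cuda.device_count() > 0
    and torch.cuda.get_device_capability()[0] >= 9
)


@unittest.skipUnless(
    CUDA_CC_COMPATIBLE,
    "Requires a CUDA device with compute capability >= 9.0 (Hopper)",
)
class UpdateCrossAttnCacheTest(unittest.TestCase):
    def test_runs_on_hopper(self) -> None:
        # Illustrative body: the real tests exercise Triton-based kernels.
        x = torch.zeros(4, device="cuda")
        self.assertEqual(x.sum().item(), 0.0)
```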
Reviewed By: rascani
Differential Revision: D90888544