Fix/Improve warp_match_all#9192
Conversation
…es and use clearing padding builtin
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughSummary by CodeRabbit
important: WalkthroughConditional support for clearing struct padding was added: a builtin wrapper macro, warp_match_all now requires trivially-copyable types and either clears padding when available or requires bitwise comparability, tests added for a padded struct, and docs updated to record constraints. ChangesPadding-aware warp_match_all
Assessment against linked issues
important: WalkthroughConditional support for clearing struct padding was added: a builtin wrapper macro, warp_match_all now requires trivially-copyable types and either clears padding when available or requires bitwise comparability, tests added for a padded struct, and docs updated to record constraints. ChangesPadding-aware warp_match_all
Assessment against linked issues
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: afa37a68-ff4e-4abf-8cf2-d9104f987019
📒 Files selected for processing (3)
libcudacxx/include/cuda/__warp/warp_match_all.hlibcudacxx/include/cuda/std/__cccl/builtin.hlibcudacxx/test/libcudacxx/cuda/warp/warp_match.pass.cpp
This comment has been minimized.
This comment has been minimized.
|
discussed offline. Code updated |
| #endif // _CCCL_HAS_BUILTIN(__is_complete_type) | ||
|
|
||
| #if _CCCL_HAS_BUILTIN(__builtin_clear_padding) \ | ||
| && (_CCCL_HOST_COMPILATION() || !(_CCCL_COMPILER(GCC) || _CCCL_COMPILER(NVHPC))) |
There was a problem hiding this comment.
I don't think the condition is correct, shouldn't this be:
| && (_CCCL_HOST_COMPILATION() || !(_CCCL_COMPILER(GCC) || _CCCL_COMPILER(NVHPC))) | |
| && (_CCCL_HOST_COMPILATION() && !(_CCCL_COMPILER(GCC) || _CCCL_COMPILER(NVHPC))) |
There was a problem hiding this comment.
this is intentional.
- If we are on the host and the builtin is supported, then enable this path.
- If we are on the device and the builtin is supported, then enable this path except when the compiler is gcc or nvc++.
| # if defined(_CCCL_BUILTIN_CLEAR_PADDING) | ||
| auto __data_copy = __data; | ||
| _CCCL_BUILTIN_CLEAR_PADDING(&__data_copy); | ||
| const auto __data_ptr = ::cuda::std::addressof(__data_copy); | ||
| # else // ^^^ _CCCL_BUILTIN_CLEAR_PADDING ^^^ / vvv !_CCCL_BUILTIN_CLEAR_PADDING vvv | ||
| static_assert(is_bitwise_comparable_v<_Tp>, "data must be bitwise comparable"); | ||
| const auto __data_ptr = ::cuda::std::addressof(__data); | ||
| # endif // _CCCL_BUILTIN_CLEAR_PADDING | ||
| ::cuda::std::memcpy(__array, __data_ptr, sizeof(_Tp)); |
There was a problem hiding this comment.
Why don't you just clear the padding after memcpy?
There was a problem hiding this comment.
the padding could be within the argument. If we lose the original data types, it is not possible to understand where the padding is
🥳 CI Workflow Results🟩 Finished in 4h 21m: Pass: 100%/116 | Total: 4d 12h | Max: 4h 17m | Hits: 46%/1287479See results here. |
closes #6085
Description
cuda::device::warp_match_allrequires types that bitwise copyable, UB otherwise.The PR adds a compile-time check with the recently introduced trait and uses
__builtin_clear_padding(clang-cuda 23) to support more types.