[CK_TILE] ABQuant New Preshuffle #3638
Pull request overview
This PR updates the preshuffle format for ABQuant to align with old-ck and AITER implementations by modifying tensor distribution logic, adjusting warp GEMM configurations, and refactoring instance factory functions.
Changes:
- Modified tensor distribution calculations to support variable access patterns based on data type sizes
- Refactored instance factory functions from explicit declarations to static lambda-based initialization
- Corrected function naming from `get_n_words_per_128b()` to `get_n_dwords_per_128b()`
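The second change above, moving instance factories from explicit declarations to static lambda-based initialization, can be sketched roughly as follows. This is a minimal illustration of the general pattern, not the PR's actual code: `KernelFn`, `kernel_table()`, `run_kernel()`, and the key `"example_abquant_preshuffle"` are all hypothetical names. With Clang, namespace-scope initializers like this typically trigger `-Wglobal-constructors`, which would be consistent with the warning suppression added in `CMakeLists.txt`, though the exact flag is not shown here.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical kernel signature; real instances would take GEMM arguments.
using KernelFn = std::function<int(int)>;

// Function-local static avoids initialization-order problems across
// translation units: the table is constructed on first use.
inline std::map<std::string, KernelFn>& kernel_table()
{
    static std::map<std::string, KernelFn> table;
    return table;
}

// Static lambda-based initialization: the lambda runs once during static
// initialization and registers this instance in the global lookup table,
// so no explicit factory declaration is needed elsewhere.
static const bool registered_example = [] {
    kernel_table()["example_abquant_preshuffle"] = [](int k) { return k * 2; };
    return true;
}();

// Look up a kernel by name; returns -1 if no instance was registered.
inline int run_kernel(const std::string& name, int k)
{
    auto it = kernel_table().find(name);
    return it == kernel_table().end() ? -1 : it->second(k);
}
```

Each instance file can then register itself independently, and callers only need the lookup function.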
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| gemm_wp_abquant_pipeline_ag_bg_cr_base_policy.hpp | Updated KBPerLoad calculation and warp GEMM configuration with NumAccess parameter |
| gemm_quant_kernel.hpp | Modified tensor view creation to use separate K split dimensions |
| wp_pipeline_agmem_bgmem_creg_base_policy.hpp | Added KAccess calculation and updated tile distribution encoding |
| gemm_universal_pipeline_ag_bg_cr_policy.hpp | Renamed LDS bank width calculation function |
| tensor_shuffle_utils.hpp | Refactored shuffle logic with new access pattern calculation |
| arch.hpp | Renamed function to better reflect it returns dword count |
| run_gemm_quant_example.inc | Simplified K_split calculation and fixed return values |
| gemm_utils.hpp | Added global kernel lookup table function |
| gemm_quant_*.cpp | Converted instance factories to static initialization pattern |
| CMakeLists.txt | Added compiler flag to suppress global constructor warnings |
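The rename to `get_n_dwords_per_128b()` and the access-pattern changes both rest on simple size arithmetic: a 128-bit access spans four 32-bit dwords, and the number of elements it covers depends on the element size. The sketch below only illustrates that arithmetic. The helper names (`dwords_per_128b`, `elems_per_128b`, `num_accesses`) are assumptions for illustration and are not the CK functions or their actual definitions, which may also depend on the target architecture.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: one access is 128 bits, i.e. 16 bytes.
constexpr std::size_t kBytesPer128b = 16;

// Number of 32-bit dwords in one 128-bit access.
constexpr std::size_t dwords_per_128b()
{
    return kBytesPer128b / sizeof(std::uint32_t);
}

// Number of elements of type T covered by one 128-bit access.
template <typename T>
constexpr std::size_t elems_per_128b()
{
    return kBytesPer128b / sizeof(T);
}

// Hypothetical: number of 128-bit accesses needed to cover k_per_thread
// elements of type T along K (rounded up).
template <typename T>
constexpr std::size_t num_accesses(std::size_t k_per_thread)
{
    return (k_per_thread + elems_per_128b<T>() - 1) / elems_per_128b<T>();
}

static_assert(dwords_per_128b() == 4, "128 bits = 4 dwords");
static_assert(elems_per_128b<std::uint8_t>() == 16, "16 one-byte elements");
static_assert(elems_per_128b<std::uint16_t>() == 8, "8 two-byte elements");
static_assert(num_accesses<std::uint8_t>(32) == 2, "32 bytes need 2 accesses");
```

Under these assumptions, narrower data types pack more elements into each access, which is why the access count has to vary with data type size.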
Proposed changes
This PR updates the preshuffle format for ABQuant to align with old-ck and AITER implementations by modifying tensor distribution logic, adjusting warp GEMM configurations, and refactoring instance factory functions.
`get_n_words_per_128b()` to `get_n_dwords_per_128b()`
Checklist
Please put an `x` into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.
- `clang-format` on all changed files