[CK_TILE] ABQuant New Preshuffle #3638

DDEle · 2026-01-23T09:43:23Z

Proposed changes

This PR updates the preshuffle format for ABQuant to align with old-ck and AITER implementations by modifying tensor distribution logic, adjusting warp GEMM configurations, and refactoring instance factory functions.

Modified tensor distribution calculations to support variable access patterns based on data type sizes
Refactored instance factory functions from explicit declarations to static lambda-based initialization
Corrected function naming from get_n_words_per_128b() to get_n_dwords_per_128b()
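
To make the rename and the "data type sizes" point concrete: a dword is 32 bits, so one 128-bit access always spans four dwords, while the number of elements it carries depends on the element size. The block below is a minimal sketch under that assumption only. It is not the code from arch.hpp; `n_dwords_per_128b` and `elems_per_128b` are hypothetical stand-ins, and the real get_n_dwords_per_128b() may be defined differently.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in (not the CK_TILE definition): number of 32-bit
// dwords covered by one 128-bit access.
constexpr std::size_t n_dwords_per_128b() { return 128 / 32; }

// Hypothetical helper: how many elements of type T fit in one 128-bit
// (16-byte) access. Smaller element types yield more elements per access.
template <typename T>
constexpr std::size_t elems_per_128b()
{
    static_assert(sizeof(T) <= 16, "element must fit in 128 bits");
    return 16 / sizeof(T);
}

static_assert(n_dwords_per_128b() == 4, "128 bits is four dwords");
static_assert(elems_per_128b<std::uint8_t>() == 16, "8-bit elements");
static_assert(elems_per_128b<std::uint16_t>() == 8, "16-bit elements");
static_assert(elems_per_128b<float>() == 4, "32-bit elements");
```

The old name suggested a count of "words", which is ambiguous (16-bit or 32-bit); the new name states the unit explicitly.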

Checklist

Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.

I have added tests relevant to the introduced functionality, and the unit tests are passing locally
I have added the test to REGRESSION_TESTS list defined at the top of CMakeLists.txt in tests/CMakeLists.txt, IF the test takes more than 30 seconds to run.
I have added inline documentation which helps the maintainers understand the motivation
I have removed the stale documentation which is no longer relevant after this pull request
(If this change is user-facing) I have added release notes which provide the end users with a brief summary of the improvement from this pull request
I have run clang-format on all changed files
Any dependent changes have been merged

Copilot

Pull request overview

This PR updates the preshuffle format for ABQuant to align with old-ck and AITER implementations by modifying tensor distribution logic, adjusting warp GEMM configurations, and refactoring instance factory functions.

Changes:

Modified tensor distribution calculations to support variable access patterns based on data type sizes
Refactored instance factory functions from explicit declarations to static lambda-based initialization
Corrected function naming from get_n_words_per_128b() to get_n_dwords_per_128b()

Reviewed changes

Copilot reviewed 32 out of 32 changed files in this pull request and generated 2 comments.

File	Description
gemm_wp_abquant_pipeline_ag_bg_cr_base_policy.hpp	Updated KBPerLoad calculation and warp GEMM configuration with NumAccess parameter
gemm_quant_kernel.hpp	Modified tensor view creation to use separate K split dimensions
wp_pipeline_agmem_bgmem_creg_base_policy.hpp	Added KAccess calculation and updated tile distribution encoding
gemm_universal_pipeline_ag_bg_cr_policy.hpp	Renamed LDS bank width calculation function
tensor_shuffle_utils.hpp	Refactored shuffle logic with new access pattern calculation
arch.hpp	Renamed function to better reflect it returns dword count
run_gemm_quant_example.inc	Simplified K_split calculation and fixed return values
gemm_utils.hpp	Added global kernel lookup table function
gemm_quant_*.cpp	Converted instance factories to static initialization pattern
CMakeLists.txt	Added compiler flag to suppress global constructor warnings
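
The last three rows fit together: a global kernel lookup table, instance files that populate it through static initialization, and a flag for the resulting global constructor warnings. The block below is a rough sketch of that general pattern, not the PR's code. `KernelFn`, `kernel_lut`, `run_kernel`, and the key strings are all hypothetical, and the real signatures in gemm_utils.hpp and gemm_quant_*.cpp are not shown on this page.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical kernel signature; the real instance factories differ.
using KernelFn = std::function<int(int)>;

// Global lookup table, built on first use so that it exists before any
// registration below touches it (avoids static initialization order issues).
inline std::unordered_map<std::string, KernelFn>& kernel_lut()
{
    static std::unordered_map<std::string, KernelFn> lut;
    return lut;
}

// One such static per instance file: the initializer is an immediately
// invoked lambda that registers the instance before main() runs.
// This is a dynamically initialized namespace-scope object, the kind of
// declaration that Clang's -Wglobal-constructors warning reports.
static const bool registered_a = [] {
    kernel_lut()["instance_a"] = [](int k) { return k * 2; };
    return true;
}();

static const bool registered_b = [] {
    kernel_lut()["instance_b"] = [](int k) { return k * 3; };
    return true;
}();

// Look up an instance by key; returns -1 if nothing is registered under it.
inline int run_kernel(const std::string& key, int arg)
{
    const auto& lut = kernel_lut();
    const auto it = lut.find(key);
    return it == lut.end() ? -1 : it->second(arg);
}
```

Compared with explicit factory declarations, this style lets each instance file register itself without a central list, at the cost of code that runs before main(). Whether the flag added in CMakeLists.txt targets exactly this warning is an assumption.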

include/ck_tile/host/tensor_shuffle_utils.hpp

example/ck_tile/38_block_scale_gemm/run_gemm_quant_example.inc

DDEle added 7 commits January 23, 2026 01:35

Refactor

e232c17

Gemm quant improvement

662bacd

Change preshuffle

8a5aea4

Fix

4e24ca1

Merge branch 'develop' into abquant-new-preshuffle

080b35f

Merge remote-tracking branch 'origin/develop' into abquant-new-preshuffle

1a45a05

Fix grouped gemm ut

75c1ca0

DDEle marked this pull request as ready for review January 26, 2026 07:39

DDEle requested review from Snektron, ThomasNing, afagaj, andriy-ca, aosewski, asleepzzz, bartekxk, carlushuang, cgmillette, coderfeli, geyyer, illsilin, poyenc, qianfengz, shumway, tenpercent, vidyasagar-amd and vpietila-amd as code owners January 26, 2026 07:39

DDEle changed the title ~~Abquant new preshuffle~~ [CK_TILE] ABQuant New Preshuffle Jan 26, 2026

DDEle added 2 commits January 27, 2026 01:12

Fix

bd60936

Merge branch 'develop' into abquant-new-preshuffle

9724fbc

DDEle requested a review from Copilot January 27, 2026 08:48

Copilot AI reviewed Jan 27, 2026

include/ck_tile/host/tensor_shuffle_utils.hpp (resolved)

example/ck_tile/38_block_scale_gemm/run_gemm_quant_example.inc (resolved)

Merge branch 'develop' into abquant-new-preshuffle

1242a1a

DDEle enabled auto-merge (squash) January 28, 2026 07:46

ThomasNing approved these changes Jan 28, 2026

DDEle merged commit 8e3d84a into develop Jan 28, 2026
23 checks passed

DDEle deleted the abquant-new-preshuffle branch January 28, 2026 07:46
