[DO NOT MERGE] Refactor/aiter integration #76
vllmellm wants to merge 581 commits into refactor-fp8-linear
@@ -189,6 +189,8 @@ def process_weights_after_loading(self, layer) -> None:
    if self.strategy == QuantizationStrategy.BLOCK:
        maybe_post_process_fp8_weight_block(layer)
Check if fp8_linear is initialised.
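One way to act on this comment is an explicit guard before the attribute is used. The following is a minimal sketch, not vLLM code: it assumes the scheme keeps its kernel wrapper in an attribute named `fp8_linear` that is `None` until it is built, and the `Scheme` class and `process` method are hypothetical stand-ins.

```python
class Scheme:
    """Hypothetical stand-in for the quantization scheme under review."""

    def __init__(self, fp8_linear=None):
        # Assumption: the attribute stays None until the kernel wrapper is built.
        self.fp8_linear = fp8_linear

    def process(self, layer):
        # Fail early with a clear message instead of a later, less obvious error.
        if getattr(self, "fp8_linear", None) is None:
            raise RuntimeError("fp8_linear is not initialised")
        return self.fp8_linear(layer)
```

Whether to raise, skip the step, or fall back to another path when the attribute is unset is a design choice the comment leaves open.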
N = w_q.shape[1]
K = w_q.shape[0]

if N % 16 == 0 and K % 16 == 0:
Add the check at https://github.com/ROCm/vllm/blob/c88d6d2ec7299605bb2ed8a4aee9260d90ef0631/vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w8a8_fp8.py#L153 to rocm_aiter_ops and use it to replace these if conditions.
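The idea of centralising the shape check in one helper could look roughly like the sketch below. It assumes the condition is only the divisibility-by-16 test shown in the diff; `is_aiter_shape_supported` is a hypothetical name, not an existing `rocm_aiter_ops` function, and the linked check may include further conditions.

```python
def is_aiter_shape_supported(n: int, k: int, multiple: int = 16) -> bool:
    """Hypothetical helper: True when both dimensions are multiples of `multiple`."""
    return n % multiple == 0 and k % multiple == 0


# Usage mirroring the diff, with a plain tuple standing in for w_q.shape.
w_q_shape = (64, 128)  # (K, N), matching K = shape[0], N = shape[1]
K, N = w_q_shape
use_aiter_path = is_aiter_shape_supported(N, K)
```

Call sites would then test `use_aiter_path` rather than repeating the modulo arithmetic inline.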
…9189) Signed-off-by: nandan2003 <nandan.vallamdasu@outlook.com> Signed-off-by: Nandan Vallamdasu <nandan.vallamdasu@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
…llm-project#28832) Signed-off-by: Bram Wasti <bwasti@meta.com> Signed-off-by: Bram Wasti <bwasti@fb.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
vllm-project#29084) Signed-off-by: NickLucche <nlucches@redhat.com>
…llm-project#29216) Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…oject#29232) Signed-off-by: zitian.zhao <zitian.zhao@tencentmusic.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
…h NEON (vllm-project#29193) Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
…vllm-project#29239) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…t#26966) Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com> Signed-off-by: mgoin <mgoin64@gmail.com> Co-authored-by: mgoin <mgoin64@gmail.com>
…llm-project#29173) Signed-off-by: Michael Act <michael.a.c.tulenan@gdplabs.id> Co-authored-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…in test_multi_connector.py due to hipErrorLaunchFailure when calling .cpu() (vllm-project#29253) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>
…ntion.py (vllm-project#29252) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com>
…st_pynccl.py (vllm-project#29119) Signed-off-by: Micah Williamson <micah.williamson@amd.com>
…istry (vllm-project#28958) Signed-off-by: Luke <yq0536@gmail.com> Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
…coding (vllm-project#29194) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
…-project#29276) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
… message history format (vllm-project#29249) Signed-off-by: joshiemoore <joshiemoore98@gmail.com>
…ct#29724) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…9727) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…ct#24722) Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: Michael Goin <mgoin@redhat.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
…ject#29732) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Huamin Li <3ericli@gmail.com>
…gibberish output (vllm-project#28783) Signed-off-by: vensen <vensenmu@gmail.com> Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
Signed-off-by: BowTen <bowten@qq.com>
…r` (vllm-project#29730) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
…project#29741) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…end` (vllm-project#29234) Signed-off-by: ganyi <ygan@amd.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…ect#29749) Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…#29756) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Shu Wang <shuw@nvidia.com> Signed-off-by: Shu Wang. <shuw@nvidia.com> Signed-off-by: Michael Goin <mgoin64@gmail.com> Co-authored-by: root <root@umbriel-b200-017.ipp4a1.colossus.nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>
…ect#29568) Signed-off-by: Yifei Zhang <yifei.zhang1992@outlook.com>
Signed-off-by: Huamin Li <3ericli@gmail.com>
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: Daniel Salib <danielsalib@meta.com> Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
…t#29750) Signed-off-by: Mickael Seznec <mickael@mistral.ai> Co-authored-by: Roger Wang <hey@rogerw.io>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…ect#29774) Signed-off-by: Fanli Lin <fanli.lin@intel.com>
… OOM (vllm-project#29504) Signed-off-by: zhxchen17 <zhxchen17@fb.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
…kMask building (vllm-project#26015) Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com> Co-authored-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
…9414) Signed-off-by: Marcin Ostrowski <marcinx.ostrowski@intel.com>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.