NVIDIA / TensorRT-LLM Public

Pull requests: NVIDIA/TensorRT-LLM

483 Open 7,045 Closed

Pull requests list

[https://nvbugs/5846154][fix] Fix CuteDSL argmax on SM120

#11185 opened Feb 2, 2026 by syuoni

1 task done

[None][test] Add DGX-Spark multinode perf cases

#11184 opened Feb 2, 2026 by JennyLiu-nv

1 task done

[None][chore] AutoDeploy: Set nanov3 config to use flashinfer ssm

#11183 opened Feb 2, 2026 by galagam • Draft

1 task done

[https://nvbugs/5854860][fix] Fix cutedsl argmax on sm120

#11181 opened Feb 2, 2026 by dongfengy

1 task done

[None][feat] Extract embeding as .savetensors and support float8 quantized model

#11180 opened Feb 2, 2026 by nvyocox • Draft

1 task done

[None][fix] Fix chat request bug for modality model

Labels: Community want to contribute, Multimodal

#11179 opened Feb 2, 2026 by Lihui-Gu

1 task

[TRTLLM-10416][fix] Move DeepEPLowLatency test to machines that support IBGDA

#11178 opened Feb 2, 2026 by yuantailing

1 task

[https://nvbugs/5810940][fix] Update lm_eval to 4.9.10 and re-enable Skip Softmax Attention tests on CI.

#11176 opened Feb 2, 2026 by bobboli

1 task

[None][fix] Fallback to NCCL instead of NCCL symmetric

#11174 opened Feb 1, 2026 by Tabrizian • Draft

1 task

[None][feat] Enable joint optimization of agent applications and TensorRT-LLM with scaffolding

Labels: Community want to contribute

#11173 opened Feb 1, 2026 by Boreas618

1 task

[None][chore] Enable Nemotron Super nvfp4 tests

#11172 opened Feb 1, 2026 by tcherckez-nvidia

1 task done

[None][chore] Remove closed bugs

#11171 opened Feb 1, 2026 by xinhe-nv

[TRTRLLM-10807][feat] Enable the refactored Nemotron Super supporting.

#11169 opened Feb 1, 2026 by nv-guomingz

1 task

[None][chore] Move test_trtllm_flashinfer_symbol_collision.py to tests/unittest/_torch

#11168 opened Feb 1, 2026 by yihwang-nv

1 task

[None][fix] Remove duplicated MoE Computation with Helix CP+DP

#11167 opened Feb 1, 2026 by brb-nv

1 task done

[https://nvbugs/5799917][fix] Recover from CUTLASS MoE doActivation perf regression for MXFP4/NVFP4 dtype

#11165 opened Jan 31, 2026 by rosenrodt

1 task done

[None][feat] Remove the hard code for activation type definition in T…

#11164 opened Jan 31, 2026 by nv-guomingz

1 task done

[https://nvbugs/5574553][fix] Unwaive tests

#11162 opened Jan 31, 2026 by hyukn

1 task done

[feat][WIP] improve sharding time

#11161 opened Jan 31, 2026 by taylor-yb-lee • Draft

1 task

[https://nvbugs/5850094][fix] Fix MoE cost estimation for auto multi-stream scheduling

#11160 opened Jan 31, 2026 by yizhang-nv

1 task done

[https://nvbugs/5435506][fix] Add optional InternVL2 any-res (dynamic tiling) image preprocessing

Labels: Community want to contribute

#11159 opened Jan 31, 2026 by johnyang-nvidia

1 task

[None][chore] Align LlmArgs with some Pydantic best practices

#11158 opened Jan 31, 2026 by anish-shanbhag • Draft

1 task done

[None][fix] Add an env var to turn off tinygemm

#11157 opened Jan 31, 2026 by dongfengy

1 task done

[None][chore] Print memory usage before/after accuracy test in CI

Labels: AutoDeploy

#11155 opened Jan 30, 2026 by taylor-yb-lee

1 task done

[None][feat] Add support for multi instances in Triton backend with pytorch backend

#11153 opened Jan 30, 2026 by achartier • Draft

1 task done
