Pull requests: vllm-project/vllm

Pull requests list

[Feature]: IndexCache support for DSA models (labels: deepseek)
#37735 opened Mar 21, 2026 by chaunceyjiang Draft
5 tasks
Support FP8 KVCache on XPU v1
#37731 opened Mar 21, 2026 by xinyu-intel Draft
5 tasks
[Bugfix] Preserve CUDA arch suffix (a/f) for SM12x — fixes NVFP4 NaN on desktop Blackwell (labels: bug, ci/build, nvidia, ready)
#37725 opened Mar 20, 2026 by RobTand
2 tasks done
[ROCm][CI] Stabilize ROCm speech-to-text translation test with ROCM_EXTRA_ARGS (labels: ready, rocm)
#37723 opened Mar 20, 2026 by AndreasKaratzas Draft
quick fix for 37665
#37722 opened Mar 20, 2026 by xuechendi
5 tasks
[ROCm][CI] Update GSM8K eval config to use fp8-and-mixed models list (MI355) (labels: ci/build, ready, rocm)
#37721 opened Mar 20, 2026 by AndreasKaratzas
[Test] Only Run MLA model when user explicitly set for batch invariance (labels: ready, v1)
#37719 opened Mar 20, 2026 by yewentao256
[Bug] Fix fp8 deepgemm batch invariant (labels: bug, ready)
#37718 opened Mar 20, 2026 by yewentao256
[ROCm][CI] Add large_gpu_mark to test_max_tokens_none for ROCm (labels: ready, rocm)
#37717 opened Mar 20, 2026 by AndreasKaratzas
[Bugfix] Mask padded rows for FlashInfer CUTLASS MoE (labels: bug, nvidia)
#37715 opened Mar 20, 2026 by kjiang249 Draft
5 tasks
Readability cleanup for wvSplitK reduces. (labels: rocm)
#37713 opened Mar 20, 2026 by amd-hhashemi
5 tasks
Properly enable wvSplitK fp8 path for RDNA
#37712 opened Mar 20, 2026 by amd-hhashemi
5 tasks
[Bugfix] Fix structured output crash on CPU due to pin_memory=True (labels: bug, ready, structured-output, v1)
#37706 opened Mar 20, 2026 by wjhrdy
3 tasks done
[Test] Add more unittests for CUDAGraphWrapper nvidia v1
#37702 opened Mar 20, 2026 by SoluMilken
3 of 5 tasks
[Bugfix] Fix FLA Hopper/TMA misclassification on SM12x desktop Blackwell (labels: bug)
#37700 opened Mar 20, 2026 by RobTand
3 tasks done
[ROCm][Bugfix] fix exception related to trust_remote_code for MiniMax-M2.1-MXFP4 (labels: bug, cpu, rocm)
#37698 opened Mar 20, 2026 by hongxiayang
5 tasks
[Perf] Use torch compile to fuse pack topk in trtllm moe (labels: nvidia, performance, ready)
#37695 opened Mar 20, 2026 by wzhao18
5 tasks
[FlexAttention] allow custom mask mod v1
#37692 opened Mar 20, 2026 by liangel-02
[cpu][ci] remove soft-fail for Arm CI and add quant model tests (labels: ci/build, cpu)
#37691 opened Mar 20, 2026 by fadara01
2 tasks