Pull requests: vllm-project/vllm

Pull requests list

[Nightly CI] Remove CT Model
#33530 opened Feb 2, 2026 by robertgshaw2-redhat
Triton MLA GQA perf fixes (4x improvement at 80k context) (labels: v1)
#33529 opened Feb 2, 2026 by koush
3 of 5 tasks
Adds padding and perf improvements to wvSplitK_fp8 (labels: rocm)
#33527 opened Feb 2, 2026 by amd-hhashemi
5 tasks
Update get_expert_mapping to include self parameter
#33525 opened Feb 1, 2026 by Otsutsukii
5 tasks
Fix mistral sliding window parsing
#33521 opened Feb 1, 2026 by andylolu2
[Bugfix] Fix tool call streaming for gpt-oss/Harmony models (labels: bug, frontend, gpt-oss)
#33520 opened Feb 1, 2026 by alexbi29
[Bugfix] Add SM110/SM120 device capability checks for NVFP4 MoE backends (labels: bug, nvidia)
#33516 opened Feb 1, 2026 by Code4me2
[Bugfix] Fix gpt-oss chat format mismatch with HuggingFace (labels: bug, frontend, gpt-oss)
#33514 opened Feb 1, 2026 by thjung123
5 tasks
fix(ROCm): Make flash_attn import optional in MLA attention (labels: rocm)
#33511 opened Feb 1, 2026 by rabi
[Feature]: Qwen3-Next dual-stream execution in_proj_qkvz in_proj_ba (labels: qwen)
#33505 opened Feb 1, 2026 by SouthWest7 Draft
1 of 5 tasks
Add **kwargs parameter to v1 FlashAttentionImpl as catch-all (labels: v1)
#33504 opened Feb 1, 2026 by haojin2
5 tasks done
[Experimental][Refactor] Refactor vision chunk modality processing for unification (labels: documentation, multi-modality, needs-rebase)
#33498 opened Feb 1, 2026 by Isotr0py Draft
5 tasks
[Bugfix] Fix assertion error in flashmla backend with fullgraph enabled (labels: bug, v1)
#33496 opened Feb 1, 2026 by Kurumi5210
5 tasks
[Feature] Support 'int8' in CacheConfig validation
#33495 opened Feb 1, 2026 by drshvik
[Doc]: update paths for Offline/Online/Others example sections (labels: documentation)
#33494 opened Feb 1, 2026 by soyr-redhat
4 of 5 tasks
Perf tuning and expansion of cases covered for wvSplitKrc (labels: rocm)
#33493 opened Feb 1, 2026 by amd-hhashemi
5 tasks
Sort safetensors files to ensure deterministic loading order
#33491 opened Feb 1, 2026 by Lumosis
1 of 5 tasks