Pull requests: Blaizzy/mlx-vlm

Attempt to fix Qwen VL generation
#1032 opened Apr 17, 2026 by pcuenca (Contributor)
Add KV cache quantization for continuous batching
#1030 opened Apr 17, 2026 by Blaizzy (Owner) · 4 of 5 tasks
Add DFlash speculative decoding (single + batch + server)
#1029 opened Apr 16, 2026 by Blaizzy (Owner) · 7 tasks done
Add vision feature caching to all models
#1028 opened Apr 16, 2026 by Blaizzy (Owner) · 6 tasks done
Add Youtu-VL
#1018 opened Apr 13, 2026 by MollySophia
Feature: Add native Gemma 4 video support
#1017 opened Apr 12, 2026 by hybridherbst
fix: propagate the verbose flag to the prefill tqdm progress bar
#1015 opened Apr 12, 2026 by PeterStaar-IBM
server: indicate the finish reason properly when the model makes a tool call
#1014 opened Apr 12, 2026 by viktike (Contributor)
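For context on PR #1014: in the OpenAI-compatible chat completions schema, a turn that ends because the model emitted a tool call should report `finish_reason: "tool_calls"` rather than `"stop"`. A minimal sketch of that selection logic; the helper name is hypothetical and mlx-vlm's server code is structured differently:

```python
def choose_finish_reason(tool_calls, hit_max_tokens):
    """Pick an OpenAI-style finish_reason for a completed chat turn.

    Hypothetical helper, not taken from the PR itself.
    """
    if tool_calls:        # the model emitted one or more tool calls
        return "tool_calls"
    if hit_max_tokens:    # generation was truncated by the token limit
        return "length"
    return "stop"         # natural end of generation (EOS or stop sequence)

# A turn that ended in a tool call should not be reported as "stop":
print(choose_finish_reason([{"name": "get_weather"}], hit_max_tokens=False))
# -> tool_calls
```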
fix: replace NaN from all-masked SDPA padding rows in Gemma 4 vision
#1006 opened Apr 10, 2026 by fabiopili · 4 tasks done
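For context on PR #1006: when an attention row is fully masked (every score is -inf, as can happen for pure padding rows), softmax computes 0/0 and the whole row becomes NaN, which then poisons every downstream layer. A NumPy sketch of the usual fix, zeroing fully-masked rows; the actual change lives in mlx-vlm's Gemma vision code, not in this helper:

```python
import numpy as np

def safe_softmax(scores):
    """Softmax that yields zeros, not NaN, for rows where every score is -inf."""
    all_masked = np.isneginf(scores).all(axis=-1, keepdims=True)
    # Substitute a harmless row before exp/sum, then zero it out again after.
    safe = np.where(all_masked, 0.0, scores)
    e = np.exp(safe - safe.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)
    return np.where(all_masked, 0.0, probs)

scores = np.array([[0.0, 1.0],            # normal attention row
                   [-np.inf, -np.inf]])   # fully masked padding row: NaN in a naive softmax
out = safe_softmax(scores)
assert not np.isnan(out).any()            # padding row comes back as zeros, not NaN
```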
feat: add logprobs support to /chat/completions
#994 opened Apr 9, 2026 by eloe
feat: add JSON mode via response_format parameter
#993 opened Apr 9, 2026 by eloe
feat: enforce tool_choice parameter in chat/completions
#992 opened Apr 9, 2026 by eloe
feat: add stop sequences support for both endpoints
#991 opened Apr 9, 2026 by eloe
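For context on PR #991: honoring `stop` sequences server-side typically means truncating the generated text at the earliest occurrence of any stop string. A minimal sketch under that assumption (the function name is illustrative, not from the PR):

```python
def apply_stop_sequences(text, stops):
    """Truncate generated text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)   # keep only the earliest cut point
    return text[:cut]

print(apply_stop_sequences("Answer: 42\nUser:", ["\nUser:", "<|end|>"]))
# -> Answer: 42
```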
feat: concurrency guard for Metal GPU serialization
#989 opened Apr 9, 2026 by eloe
Fix LoRA scaling: divide alpha by rank (#845)
#986 opened Apr 9, 2026 by H-A-Khan
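For context on PR #986: in the standard LoRA formulation the low-rank update is scaled by alpha / r, i.e. W' = W + (alpha / r) * B @ A; applying alpha without dividing by the rank makes the update r times too large. A NumPy sketch of the corrected scaling (shapes and names are illustrative):

```python
import numpy as np

def lora_delta(A, B, alpha):
    """Low-rank LoRA update with the standard scaling: (alpha / r) * B @ A."""
    r = A.shape[0]                     # rank r: A is (r, d_in), B is (d_out, r)
    return (alpha / r) * (B @ A)

A = np.ones((8, 32))                   # rank-8 adapter
B = np.ones((16, 8))
delta = lora_delta(A, B, alpha=16.0)   # effective scale alpha / r = 2.0
assert np.allclose(delta, 2.0 * (B @ A))
```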