Pull requests: NVIDIA/Megatron-LM
Pull requests list
Fix: don't enter branch if mtp_num_layers == 0
community-request
#2581
opened Dec 6, 2025 by
rj42
6 tasks
Synchronize total block count across pipeline parallel ranks
#2578
opened Dec 5, 2025 by
santhnm2
6 tasks
fix: ckpt loading failed because of padding metadata in dist optimizer
Expert Review
Apply this label to indicate that your PR is ready for expert review.
#2576
opened Dec 5, 2025 by
yaoyu-33
6 tasks
[Megatron-FSDP] Support both old and new DeviceMesh APIs.
Expert Review
Add initial support for Kimi Delta Attention (Feature Request #2446)
community-request
#2573
opened Dec 5, 2025 by
CodersAcademy006
draft: partial cudagraph scopes and improvements for training
#2572
opened Dec 5, 2025 by
jiemingz
6 tasks
Fix world size mismatch causing distributed init deadlock (Issue #2458)
community-request
#2571
opened Dec 5, 2025 by
CodersAcademy006
Fix: Ensure token IDs respect vocab_size in dataset, embeddings, and …
community-request
#2570
opened Dec 5, 2025 by
CodersAcademy006
Add offset method for slow tokenizer
community-request
#2567
opened Dec 5, 2025 by
cael-ling
6 tasks
[Draft] Modality bridge for Modality-Decoupled Parallelism (MDP)
#2500
opened Dec 4, 2025 by
shifangx
6 tasks