
Support parameter tags in optimizer weight decay #996

Open

dlwh wants to merge 3 commits into main from codex/implement-param_tags-feature-for-learning-rates

Conversation

@dlwh (Member) commented Jun 17, 2025

Summary

  • allow OptimizerConfig to tag parameters via TagPattern
  • use parameter tags when computing the weight decay mask (see the first sketch after this list)
  • support tag-based learning rate configuration through a dict-valued learning_rate
  • implement scale_by_tagged_learning_rates to apply tag-based LRs (see the second sketch after this list)
  • document the tagging features in the configuration guide, with an expanded section on tag matching
  • extend tests to cover tag priority and multiple tagged learning rates
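
The second and fourth bullets above describe the two mechanisms this PR adds. Below are minimal sketches of how they might work; the helper signatures, the use of regex patterns over parameter paths, and the argument names are illustrative assumptions, not the actual implementation.

First, a tag-derived weight decay mask, assuming tags are assigned by matching patterns against a parameter's path and earlier patterns take priority:

```python
import re

import jax


def tagged_weight_decay_mask(params, tag_patterns, no_decay_tags):
    """Return a pytree of bools: True where weight decay should apply.

    tag_patterns: ordered list of (tag, regex) pairs; earlier entries win.
    no_decay_tags: set of tags whose parameters are exempt from decay.
    """

    def decays(path, _leaf):
        path_str = jax.tree_util.keystr(path)
        for tag, pattern in tag_patterns:
            if re.search(pattern, path_str):
                return tag not in no_decay_tags
        return True  # untagged parameters decay by default

    return jax.tree_util.tree_map_with_path(decays, params)
```

The resulting mask pytree can be passed to `optax.add_decayed_weights(weight_decay, mask=mask)`.

Second, a sketch of `scale_by_tagged_learning_rates` as an optax `GradientTransformation`, again assuming regex-based tag patterns and a dict of per-tag learning rates with a default fallback:

```python
import re

import jax
import jax.numpy as jnp
import optax


def scale_by_tagged_learning_rates(default_lr, tagged_lrs, tag_patterns):
    """Scale each update leaf by the LR of its first matching tag."""

    def lr_for(path_str):
        for tag, pattern in tag_patterns:  # checked in priority order
            if re.search(pattern, path_str):
                return tagged_lrs.get(tag, default_lr)
        return default_lr

    def init_fn(params):
        del params
        return optax.EmptyState()

    def update_fn(updates, state, params=None):
        del params
        flat, treedef = jax.tree_util.tree_flatten_with_path(updates)
        # Negate so this composes like optax's scale_by_learning_rate.
        scaled = [-lr_for(jax.tree_util.keystr(path)) * u for path, u in flat]
        return jax.tree_util.tree_unflatten(treedef, scaled), state

    return optax.GradientTransformation(init_fn, update_fn)


# Toy usage: embedding parameters get a smaller learning rate.
params = {"embeddings": jnp.ones(4), "mlp": {"w": jnp.ones((2, 2))}}
grads = jax.tree_util.tree_map(jnp.ones_like, params)

tx = scale_by_tagged_learning_rates(
    default_lr=1e-3,
    tagged_lrs={"embed": 1e-4},
    tag_patterns=[("embed", r"embeddings")],
)
opt_state = tx.init(params)
updates, opt_state = tx.update(grads, opt_state)
params = optax.apply_updates(params, updates)
```

Because patterns are checked in order, the first match wins; that ordering is what the tag-priority tests exercise.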

Testing

  • `pre-commit run --files docs/reference/Configuration.md src/levanter/optim/config.py tests/test_weight_decay_mask.py`
  • `pytest tests/test_weight_decay_mask.py -q`

https://chatgpt.com/codex/tasks/task_e_685068f5119c8331ae1085b344c65a44
