Add PolarExpress Variant to Muon #1621

@MarcMachaczek

Description

This has already been suggested in #1602.

PolarExpress (Amsel et al., 2025) is a method for computing the polar decomposition of a matrix, which its authors describe as optimal. They report consistent improvements over other methods when it is used with Muon, and they address "finite-precision issues, making it practical to use in bfloat16".

In terms of implementation, it takes a similar form to Newton-Schulz, but with polynomial coefficients that change from one iteration to the next. Hence, it is a lightweight addition to Muon and fits nicely into the existing interface.
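
To illustrate the shape of the change, here is a minimal NumPy sketch of a Newton-Schulz-style loop that accepts one coefficient triple per iteration. The function name `orthogonalize`, its arguments, and the schedule below are hypothetical and are not the existing Muon interface. The coefficients are the classical quintic Newton-Schulz values, used only as placeholders; the actual PolarExpress coefficients would have to be taken from the paper.

```python
import numpy as np


def orthogonalize(grad, coeffs_per_step, eps=1e-7):
    """Approximate the polar factor of `grad` with an odd quintic iteration.

    `coeffs_per_step` is a sequence of (a, b, c) triples, one per iteration.
    Passing the same triple every time gives a fixed-coefficient
    Newton-Schulz-style iteration; passing a different triple per step is
    the general form that an iteration-dependent method would use.
    """
    x = np.asarray(grad, dtype=np.float64)
    # Work with the wide orientation so the Gram matrix x @ x.T is small.
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T
    # Frobenius normalization bounds every singular value by 1.
    x = x / (np.linalg.norm(x) + eps)
    for a, b, c in coeffs_per_step:
        gram = x @ x.T
        # Each singular value s is mapped to a*s + b*s**3 + c*s**5.
        x = a * x + (b * gram + c * gram @ gram) @ x
    return x.T if transposed else x


# Placeholder schedule (assumption): the classical quintic coefficients,
# repeated for every step. These are NOT the PolarExpress coefficients.
CLASSIC_QUINTIC = (15 / 8, -10 / 8, 3 / 8)
fixed_schedule = [CLASSIC_QUINTIC] * 8
```

For example, `orthogonalize(np.array([[3.0, 1.0], [1.0, 2.0]]), fixed_schedule)` should return a matrix close to the identity, since the input is symmetric positive definite. Under this sketch, switching to an iteration-dependent method would only mean supplying a different list of triples.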

(I promise next time I will create the issue before submitting the PR)
