[Feature Request] Support weightless RMSNorm (for FlashNorm weight folding trick) #3200

@NilsGraf

Description

Please add support for RMSNorm without normalization weights.

This is to support FlashNorm — a mathematically equivalent variant of RMSNorm that folds norm weights into the subsequent linear layer. See explainer video.

We have applied this weight-folding trick to a few LLMs (Llama, Qwen, SmolLM) here:
https://huggingface.co/models?other=weightless-rmsnorm
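
To illustrate the folding trick described above, here is a minimal numpy sketch (not the FlashNorm reference code; shapes and names are illustrative). Scaling column j of the following linear layer's weight matrix by the j-th RMSNorm weight gives the same output as applying the weighted norm first:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # weightless RMSNorm: x / sqrt(mean(x^2) + eps)
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)
g = rng.standard_normal(d)       # RMSNorm weights
W = rng.standard_normal((4, d))  # subsequent linear layer

# standard path: weighted RMSNorm, then linear
y_ref = W @ (rms_norm(x) * g)

# FlashNorm path: fold g into W once, then use weightless RMSNorm
W_folded = W * g                 # scales column j of W by g[j]
y_flash = W_folded @ rms_norm(x)

assert np.allclose(y_ref, y_flash)
```

The folding is a one-time offline transformation of the checkpoint, which is why the runtime multiply by the norm weights can be dropped entirely.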

Motivation

FlashNorm's removal of norm weights reduces inference overhead at zero accuracy cost, and we'd like to share these optimized models with the broader community.

Possible Implementation

Remove the norm weights from your RMSNorm implementation, e.g. simply skip the norm-weight multiplication when no weights are provided.
