Please add support for RMSNorm without normalization weights.
This is to support FlashNorm — a mathematically equivalent variant of RMSNorm that folds norm weights into the subsequent linear layer. See explainer video.
We have applied this weight folding trick to a few LLMs (Llama, Qwen, SmolLM) here:
https://huggingface.co/models?other=weightless-rmsnorm
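To illustrate why the fold is mathematically equivalent, here is a small PyTorch sketch (all tensor names are just for illustration, not from any particular codebase): scaling the normalized activations by the norm weights g before a linear layer gives the same output as folding g into the linear layer's weight columns.

```python
import torch

torch.manual_seed(0)
dim, out_dim = 8, 16
x = torch.randn(2, dim)
g = torch.randn(dim)            # RMSNorm weights
W = torch.randn(out_dim, dim)   # weights of the subsequent linear layer

# RMS-normalize without any elementwise scale.
x_norm = x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + 1e-6)

# Standard path: scale by the norm weights, then apply the linear layer.
y_ref = (x_norm * g) @ W.T

# FlashNorm path: fold g into the linear weights (each input column j
# is scaled by g[j]), so the norm itself no longer needs weights.
W_folded = W * g                # broadcasts g over the input dimension
y_fold = x_norm @ W_folded.T

print(torch.allclose(y_ref, y_fold, atol=1e-5))  # True
```

Since the fold can be done once at model-conversion time, inference only pays for the normalization itself.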
Motivation
FlashNorm's removal of norm weights reduces inference overhead at zero accuracy cost, and we'd like to share these optimized models with the broader community.
Possible Implementation
Make the norm weights optional in your RMSNorm implementation, e.g. simply skip the norm weight multiplication when no norm weights are provided. See the sketch below.
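A minimal sketch of what this could look like (this is an illustration of the idea, not your actual RMSNorm code): the weight parameter becomes optional, and the forward pass skips the multiplication when it is absent.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """RMSNorm with optional normalization weights.

    When weight=False, no scale parameter is allocated and the
    elementwise multiplication is skipped, which is exactly what
    FlashNorm-folded checkpoints need.
    """

    def __init__(self, dim: int, eps: float = 1e-6, weight: bool = True):
        super().__init__()
        self.eps = eps
        # Only allocate norm weights if requested.
        self.weight = nn.Parameter(torch.ones(dim)) if weight else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Standard RMS normalization: x / sqrt(mean(x^2) + eps)
        x = x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        # Skip the multiplication when no norm weights are provided.
        if self.weight is not None:
            x = x * self.weight
        return x
```

This keeps the default behavior unchanged for existing checkpoints while allowing the weightless variant to load cleanly.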