[FlexAttention] allow custom mask mod #37692
liangel-02 wants to merge 1 commit into vllm-project:main from
Conversation
Code Review
This pull request introduces a new `block_sparsity_hint` parameter to the `FlexAttentionMetadata` class and modifies the attention mechanism to allow custom mask modifications. The changes aim to provide more flexibility in defining attention patterns, including support for custom sparsity hints. The code has been reviewed and a critical issue has been identified.
```diff
 # (causal mask for decoder or bidirectional mask for encoder)
-if self.causal:
+has_custom_mask = self.logical_mask_mod is not causal_mask_mod
+if self.causal or has_custom_mask:
```
The condition `self.causal or has_custom_mask` always evaluates to `True` when `has_custom_mask` is `True`, so the code takes the `self.get_causal_mask_mod()` path whenever a custom mask is present, regardless of the value of `self.causal`. This is likely not the intended behavior: a user might want a bidirectional mask combined with a custom modification, and forcing the causal path could produce unexpected or incorrect attention patterns.
To fix this, check for a custom mask first, so that a custom mask overrides the causal/bidirectional choice; `self.causal` should only be consulted when no custom mask is present.
```diff
-if self.causal or has_custom_mask:
+if has_custom_mask:
+    mask_mod = self.logical_mask_mod
+elif self.causal:
+    mask_mod = self.get_causal_mask_mod()
+else:
+    mask_mod = self.get_bidirectional_mask_mod()
```
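For context, a FlexAttention mask mod is a callable with the signature `(b, h, q_idx, kv_idx) -> bool` that decides whether a query position may attend to a key position. A minimal sketch of what a user-supplied `logical_mask_mod` might look like is below; the `sliding_window_mask_mod` helper is hypothetical and for illustration only, and plain ints are used where real FlexAttention would pass tensors:

```python
# Sketch of FlexAttention-style mask mods. Signature: (b, h, q_idx, kv_idx) -> bool,
# where b is the batch index and h the head index (unused in these examples).

def causal_mask_mod(b, h, q_idx, kv_idx):
    # Standard causal mask: a query attends only to keys at or before its position.
    return q_idx >= kv_idx

def sliding_window_mask_mod(window: int):
    # Hypothetical custom mask mod: causal attention restricted to a local window.
    def mask_mod(b, h, q_idx, kv_idx):
        return (kv_idx <= q_idx) and (q_idx - kv_idx < window)
    return mask_mod

# With the suggested logic above, passing a custom mod would set has_custom_mask:
logical_mask_mod = sliding_window_mask_mod(window=4)
has_custom_mask = logical_mask_mod is not causal_mask_mod  # True here
```

The identity check (`is not causal_mask_mod`) only detects that a different callable was supplied, which is why the custom branch must take priority over the causal/bidirectional defaults.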
Signed-off-by: Angel Li <[email protected]>
LucasWilkinson
left a comment
@drisspg do you think you can help review this?
Updating the FlexAttention impl to accept a custom mask mod from users.