
[ROCm][Bugfix] fix exception related to trust_remote_code for MiniMax-M2.1-MXFP4 #37698

Open
hongxiayang wants to merge 4 commits into vllm-project:main from hongxiayang:fix_quark_trust_remote

@hongxiayang
Collaborator

@hongxiayang hongxiayang commented Mar 20, 2026

Purpose

Bug Fix: QuarkConfig.maybe_update_config

Problem: The original code called get_config() with a hardcoded trust_remote_code=False for every Quark model. This caused:

  1. Exceptions for models such as amd/MiniMax-M2.1-MXFP4 that require trust_remote_code=True. For example:
     Value error, The repository amd/MiniMax-M2.5-MXFP4 contains custom code which must be executed to correctly load the model.
  2. Wasteful HF hub access for non-DeepSeek AMD Quark models, where the logic doesn't even apply.
  3. No way for the user to override trust_remote_code, because it was hardcoded.

File Changes

vllm/model_executor/layers/quantization/quark/quark.py:

Replaced the get_config() call with the pre-loaded hf_config from ModelConfig, so the config no longer needs to be fetched from the HF hub. The user can now also override trust_remote_code from the command line.

Added an early return for non-deepseek_v3 model types, checked against the _DEEPSEEK_V3_FAMILY_MODEL_TYPES frozenset.

vllm/model_executor/layers/quantization/base_config.py: Extended the base maybe_update_config signature to accept revision + **kwargs.

vllm.py: Passes hf_config, revision, and trust_remote_code from ModelConfig to maybe_update_config. This allows the user to specify trust_remote_code.

Other call sites were updated to align with the signature change.
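
For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the control flow described above. It is not the code in this PR: SketchQuarkConfig, the deepseek_only_flag attribute, the parameter order, the contents of the frozenset, and the "minimax_m2" model type are all assumptions made for illustration. Only the names maybe_update_config, hf_config, revision, trust_remote_code, and _DEEPSEEK_V3_FAMILY_MODEL_TYPES come from the description.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Assumption: the PR description names this set but does not list its members.
_DEEPSEEK_V3_FAMILY_MODEL_TYPES = frozenset({"deepseek_v3"})


class SketchQuarkConfig:
    """Hypothetical stand-in for QuarkConfig; not the real vLLM class."""

    def __init__(self) -> None:
        # Hypothetical name for whatever the DeepSeek-only logic toggles.
        self.deepseek_only_flag = False

    def maybe_update_config(
        self,
        model_name: str,
        hf_config: Optional[Any] = None,
        revision: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        # Use the config the caller already loaded; no hub access happens
        # here, so there is no trust_remote_code value to hardcode.
        if hf_config is None:
            return  # a missing hf_config must not crash

        model_type = getattr(hf_config, "model_type", None)
        if model_type not in _DEEPSEEK_V3_FAMILY_MODEL_TYPES:
            return  # early return: the logic does not apply to this model

        self.deepseek_only_flag = True


# Usage: the caller passes its pre-loaded config and the user's setting.
cfg = SketchQuarkConfig()
cfg.maybe_update_config(
    "amd/MiniMax-M2.1-MXFP4",
    hf_config=SimpleNamespace(model_type="minimax_m2"),  # assumed model_type
    revision=None,
    trust_remote_code=True,  # absorbed by **kwargs in this sketch
)
```

Accepting **kwargs in the signature lets call sites forward extra arguments such as trust_remote_code without every quantization config having to name them.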

Added New Test

tests/quantization/test_quark_maybe_update_config.py: 3 tests using real HF configs. They verify that the updated setting stays False for amd/MiniMax-M2.1-MXFP4, becomes True for amd/DeepSeek-R1-MXFP4-ASQ, and that a missing hf_config doesn't crash.

Test Result

root@node:/home/vllm/tests/quantization# pytest test_quark_maybe_update_config.py
==================================================== test session starts ====================================================
platform linux -- Python 3.12.12, pytest-9.0.2, pluggy-1.6.0
rootdir: /dockerx/vllm
configfile: pyproject.toml
plugins: asyncio-1.3.0, anyio-4.12.1
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 3 items

test_quark_maybe_update_config.py ... [100%]

=============================================== 3 passed, 2 warnings in 4.72s ===============================================
sys:1: DeprecationWarning: builtin type swigvarlink has no module attribute



@mergify mergify bot added the rocm (Related to AMD ROCm), cpu (Related to CPU backends), and bug (Something isn't working) labels Mar 20, 2026
@github-project-automation github-project-automation bot moved this to Todo in AMD Mar 20, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the bug related to trust_remote_code for Quark models by propagating the hf_config from ModelConfig down to maybe_update_config. This avoids re-fetching the configuration with a hardcoded trust_remote_code=False and also adds a performance improvement by skipping logic for non-applicable models. The signature changes across various quantization configs are correctly handled to maintain compatibility. I've added one comment regarding improving the robustness of dictionary access to prevent potential KeyError exceptions.
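
The inline comment itself is not shown on this page, so the following is only a generic sketch of what more robust dictionary access can look like. The helper get_nested and the example keys are hypothetical and do not come from the PR.

```python
from collections.abc import Mapping
from typing import Any, Optional


def get_nested(config: Mapping, *keys: str, default: Optional[Any] = None) -> Any:
    """Walk nested mappings, returning `default` instead of raising KeyError."""
    current: Any = config
    for key in keys:
        # Stop as soon as a level is missing or is not a mapping.
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical config shape, for illustration only.
example = {"layer": {"weight": {"dtype": "fp4"}}}

dtype = get_nested(example, "layer", "weight", "dtype")
missing = get_nested(example, "layer", "input", "dtype")  # no KeyError raised
```

For a single level, dict.get(key) with a default achieves the same effect without a helper.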

@mergify

mergify bot commented Mar 20, 2026

Hi @hongxiayang, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

@hongxiayang hongxiayang changed the title from "[ROCm][Bugfix] fix exception related to trust_remote_code for certain amd quark models" to "[ROCm][Bugfix] fix exception related to trust_remote_code for MiniMax-M2.1-MXFP4" Mar 20, 2026
@hongxiayang
Collaborator Author

cc @dllehr-amd

Hongxia Yang added 4 commits March 20, 2026 18:19
Signed-off-by: Hongxia Yang <[email protected]>
Signed-off-by: Hongxia Yang <[email protected]>
Signed-off-by: Hongxia Yang <[email protected]>
@hongxiayang hongxiayang force-pushed the fix_quark_trust_remote branch from d86a7a3 to 30d7c71 Compare March 20, 2026 23:20