[task] fix: fix bug for args.train.accelerator.fsdp_config.mixed_precision.enable #638
Conversation
Code Review
This pull request updates various training scripts to use the mixed precision configuration object instead of a boolean flag. However, the review identifies a critical issue where the build_parallelize_model function expects a MixedPrecisionConfig object rather than a boolean, meaning the current implementation ignores the user's configuration. The reviewer suggests passing the full mixed_precision configuration object to ensure the settings are correctly applied.
  enable_full_shard=args.train.accelerator.fsdp_config.full_shard,
  enable_reshard_after_forward=args.train.accelerator.fsdp_config.reshard_after_forward,
- enable_mixed_precision=args.train.enable_mixed_precision,
+ enable_mixed_precision=args.train.accelerator.fsdp_config.mixed_precision.enable,
The build_parallelize_model function expects a mixed_precision parameter of type MixedPrecisionConfig, not a boolean enable_mixed_precision. Passing a boolean to a non-existent parameter name will cause the function to use its default MixedPrecisionConfig(enable=True), effectively ignoring the user's configuration for mixed precision.
Suggested change:
- enable_mixed_precision=args.train.accelerator.fsdp_config.mixed_precision.enable,
+ mixed_precision=args.train.accelerator.fsdp_config.mixed_precision,
  enable_full_shard=args.train.accelerator.fsdp_config.full_shard,
  enable_reshard_after_forward=args.train.accelerator.fsdp_config.reshard_after_forward,
- enable_mixed_precision=args.train.enable_mixed_precision,
+ enable_mixed_precision=args.train.accelerator.fsdp_config.mixed_precision.enable,
The build_parallelize_model function expects a mixed_precision parameter of type MixedPrecisionConfig, not a boolean enable_mixed_precision. Passing a boolean to a non-existent parameter name will cause the function to use its default MixedPrecisionConfig(enable=True), effectively ignoring the user's configuration for mixed precision.
Suggested change:
- enable_mixed_precision=args.train.accelerator.fsdp_config.mixed_precision.enable,
+ mixed_precision=args.train.accelerator.fsdp_config.mixed_precision,
  enable_full_shard=args.train.accelerator.fsdp_config.full_shard,
  enable_reshard_after_forward=args.train.accelerator.fsdp_config.reshard_after_forward,
- enable_mixed_precision=args.train.enable_mixed_precision,
+ enable_mixed_precision=args.train.accelerator.fsdp_config.mixed_precision.enable,
The build_parallelize_model function expects a mixed_precision parameter of type MixedPrecisionConfig, not a boolean enable_mixed_precision. Passing a boolean to a non-existent parameter name will cause the function to use its default MixedPrecisionConfig(enable=True), effectively ignoring the user's configuration for mixed precision.
Suggested change:
- enable_mixed_precision=args.train.accelerator.fsdp_config.mixed_precision.enable,
+ mixed_precision=args.train.accelerator.fsdp_config.mixed_precision,
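The silent-default behavior the reviewer describes would only occur if build_parallelize_model accepts a **kwargs catch-all; otherwise an unknown keyword would raise a TypeError. Below is a minimal sketch of that pitfall and of the suggested fix. The MixedPrecisionConfig fields and the **kwargs signature are assumptions for illustration, not the project's actual definitions.

```python
from dataclasses import dataclass


@dataclass
class MixedPrecisionConfig:
    # Hypothetical fields; the real config lives in the project's config module.
    enable: bool = True
    param_dtype: str = "bfloat16"


def build_parallelize_model(model, mixed_precision=None, **kwargs):
    # If the real function accepts **kwargs (e.g. to forward extra options),
    # a misspelled keyword like enable_mixed_precision= is silently swallowed
    # here instead of raising a TypeError.
    if mixed_precision is None:
        mixed_precision = MixedPrecisionConfig()  # default wins
    return model, mixed_precision


# Buggy call: the boolean lands in **kwargs, so the default
# MixedPrecisionConfig(enable=True) is used regardless of the user's setting.
_, mp_buggy = build_parallelize_model("model", enable_mixed_precision=False)
assert mp_buggy.enable is True  # user's False was ignored

# Suggested fix: pass the whole config object under the correct parameter name.
_, mp_fixed = build_parallelize_model(
    "model", mixed_precision=MixedPrecisionConfig(enable=False)
)
assert mp_fixed.enable is False  # user's setting is now honored
```

Passing the full config object (rather than re-deriving a boolean at each call site) also keeps any other fields on the config, such as dtype choices, flowing through to the parallelization logic.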
Hi, can you help fix the lint error with …
I will do it immediately.
done
What does this PR do?
Checklist Before Starting
- PR title follows the [{modules}] {type}: {description} format (see check_pr_title.yml for the full list of allowed modules and types)
- Breaking API changes are marked with a [BREAKING] prefix, e.g. [BREAKING][parallel, model] feat: dynamic batching

Test
API and Usage Example
Design & Code Changes
Checklist Before Submitting
- If tasks/training scripts were moved or renamed: updated docs/examples and verified python3 scripts/ci/check_doc_task_paths.py passes (also enforced by the Check doc task paths CI workflow)