Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! 此拉取请求引入了一项重要功能,允许用户在数据行级别定义损失计算策略,从而提供了对模型训练过程中损失权重的更精细控制。通过在数据集的每行中添加"loss_scale"字段,用户可以为不同的数据点指定独特的损失行为,例如仅计算最后一轮对话的损失或计算所有token的损失。这极大地增强了训练的灵活性,特别是在处理需要差异化损失处理的复杂数据集时。 Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new feature allowing loss_scale strategies to be specified at the data row level, overriding global command-line parameters. This functionality is reflected in updated documentation across various files, including Customization/Architecture.md, Customization/Custom-dataset.md, Instruction/Agent-support.md, Instruction/Command-line-parameters.md, and Megatron-SWIFT/Command-line-parameters.md. Code changes include adding loss_scale to _DATA_ROW_KEYS in swift/dataset/preprocessor/core.py, preserving the loss_scale field in swift/dataset/utils.py, and modifying the CustomLossScale class in swift/loss_scale/base.py to handle per-row loss_scale overrides with fallback logic. A review comment suggests moving the from .mapping import get_loss_scale import statement from within a method to the module level in swift/loss_scale/base.py to improve performance and adhere to Python best practices.
swift/loss_scale/base.py
Outdated
| row_loss_scale = kwargs.get('loss_scale') | ||
| if row_loss_scale is not None: | ||
| # Use per-row loss_scale with higher priority than global setting | ||
| from .mapping import get_loss_scale |
There was a problem hiding this comment.
将 from .mapping import get_loss_scale 语句移动到文件顶部(模块级别)。在方法内部导入模块会导致每次调用该方法时都重新执行导入操作,这会降低性能并可能导致意外行为。将所有导入语句放在文件顶部是 Python 的最佳实践,可以提高代码的可读性和效率。
from swift.template import ContextType, Messages, get_last_user_round
from swift.utils import get_logger
from .utils import calculate_loss_scale
from .mapping import get_loss_scale
swift/dataset/utils.py
Outdated
| encoded = self.template.encode(row, return_length=True) | ||
| # Preserve loss_scale from data row if present (for per-row loss_scale strategy) | ||
| if 'loss_scale' in row: | ||
| encoded['loss_scale'] = row['loss_scale'] |
There was a problem hiding this comment.
Can this be implemented by modifying here?
ms-swift/swift/template/template_inputs.py
Lines 57 to 62 in 9092bc5
There was a problem hiding this comment.
Can this be implemented by modifying here?
ms-swift/swift/template/template_inputs.py
Lines 57 to 62 in 9092bc5
这部分代码不影响,已还原
| - List[float]: Loss scale values corresponding one-to-one with the | ||
| returned context list | ||
| """ | ||
| # Check for per-row loss_scale override in kwargs (from data row) |
There was a problem hiding this comment.
Is it possible to use different loss_scale in the template?
ms-swift/swift/template/base.py
Line 140 in 9092bc5
There was a problem hiding this comment.
Is it possible to use different loss_scale in the template?
ms-swift/swift/template/base.py
Line 140 in 9092bc5
是的,数据中的loss_scale可以传入
ms-swift/swift/template/base.py
Lines 1233 to 1234 in 9092bc5
支持不同数据使用不同的 loss 计算策略
通过数据行级别的
"loss_scale"字段为每行数据指定不同的 loss 计算策略