Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion open_instruct/grpo_fast.py
Original file line number Diff line number Diff line change
Expand Up @@ -2493,7 +2493,7 @@ def maybe_evaluate(
):
"""Optionally evaluate the model."""
try:
# timeout 0.01 if this is the last training step or we're not evaluating
# timeout 0.01 if this is not the last training step or we're not evaluating
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Thanks for fixing the comment, it's much clearer now. To further improve readability, consider refactoring the conditional logic on line 2565. The current logic is a bit inverted from how one might naturally read the condition. By flipping the if/else, the code can more directly express the primary case for the long timeout.

Suggestion:

Change line 2565 to:

timeout = 100 if training_step >= args.num_training_steps and args.local_eval_every >= 0 else 0.01

This makes it explicit that the long timeout is used only on the last step when evaluation is enabled, which aligns nicely with the second line of the comment block.

# otherwise, wait to get the last evaluation generations (long timeout just in case)
timeout = 0.01 if (training_step < args.num_training_steps or args.local_eval_every < 0) else 100

Expand Down
Loading