
dataset: pin validation shard #545

Open
KartikVashishta wants to merge 4 commits into karpathy:master from KartikVashishta:fix/val-shard


@KartikVashishta
Contributor

Addresses #541. The validation shard is now fixed, so val loss/bpb is stable regardless of how many shards were downloaded.

Collaborator

@svlandeg svlandeg left a comment

Thanks for the PR! I had a few comments, inline:

svlandeg added the "bug" (Something isn't working) label on Feb 19, 2026
svlandeg added the "waiting" (Waiting for user feedback/action) label on Feb 19, 2026
svlandeg linked an issue on Feb 19, 2026 that may be closed by this pull request
svlandeg self-assigned this on Feb 20, 2026
svlandeg removed the "waiting" (Waiting for user feedback/action) label on Feb 20, 2026
Collaborator

@svlandeg svlandeg left a comment

I looked at the code diff and ran some quick tests on this PR's branch. Everything looks good, apart from a few nitpicky comments 😉

It makes sense to me to designate shard_01822.parquet as the standard validation shard. The docstrings and UX messages have been updated accordingly.
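
For readers skimming the thread, here is a minimal sketch of what pinning the validation shard means in practice. It is not taken from the PR diff: the constant `VAL_SHARD` and the helper `split_shards` are hypothetical names, and the only detail borrowed from this conversation is the filename shard_01822.parquet. The idea is to select the validation shard by name rather than by position in the list of downloaded shards, so the val split does not change when more shards are downloaded.

```python
from pathlib import Path

# Filename taken from this review thread; everything else here is a
# hypothetical illustration, not the code from the PR.
VAL_SHARD = "shard_01822.parquet"


def split_shards(filenames, val_shard=VAL_SHARD):
    """Split downloaded shard files into (train, val) using a pinned val shard.

    The validation shard is chosen by name, not by its position in the
    sorted list, so the result does not depend on how many shards exist.
    """
    names = sorted(Path(f).name for f in filenames)
    if val_shard not in names:
        # Fail loudly instead of silently validating on a different shard.
        raise FileNotFoundError(f"validation shard {val_shard} has not been downloaded")
    train = [n for n in names if n != val_shard]
    return train, [val_shard]


if __name__ == "__main__":
    few = ["shard_00000.parquet", "shard_00001.parquet", VAL_SHARD]
    many = few + ["shard_00002.parquet", "shard_00003.parquet"]
    # Both calls yield the same validation split despite different downloads.
    print(split_shards(few)[1], split_shards(many)[1])
```

Raising an error when the pinned shard is missing is one possible design choice; it trades convenience for the guarantee that reported val loss/bpb always refers to the same data.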

svlandeg removed their assignment on Feb 23, 2026
@KartikVashishta
Contributor Author

Thanks for the review @svlandeg !

Labels

bug Something isn't working suggest/merge

Development

Successfully merging this pull request may close these issues.

make validation loss reproducible

2 participants