-
Notifications
You must be signed in to change notification settings - Fork 2.3k
[BugFix] Normalize add_files parquet stats to Iceberg logical bounds #69207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep them coming! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
3df1194 to
e11e91e
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e11e91e538
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (statValue instanceof Binary binary) { | ||
| return UUIDUtil.convert(binary.getBytes()); | ||
| } | ||
| if (statValue instanceof byte[] bytes) { | ||
| return UUIDUtil.convert(bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catch invalid UUID stats before converting bounds
UUIDUtil.convert(...) throws when the stats payload is not exactly 16 bytes, but this branch does not handle that exception; in extractParquetMetrics that propagates and fails the entire ADD FILES operation instead of treating the column bounds as missing. This is reachable when importing parquet files whose UUID column stats are malformed or schema-mismatched (for example legacy BINARY data mapped to UUID), so the conversion should be guarded and return null to let missingStats cleanup drop only unreliable metrics.
Useful? React with 👍 / 👎.
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[FE Incremental Coverage Report]✅ pass : 119 / 119 (100.00%) file detail
|
[BE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
|



Why I'm doing:
What I'm doing:
compare with Comparators.forType, and serialize bounds with Conversions.toByteBuffer.
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: