Skip to content

feat!: Parquet compression#1960

Open
emkornfield wants to merge 61 commits intodelta-io:mainfrom
emkornfield:parquet_compression
Open

feat!: Parquet compression#1960
emkornfield wants to merge 61 commits intodelta-io:mainfrom
emkornfield:parquet_compression

Conversation

@emkornfield
Copy link
Copy Markdown
Collaborator

@emkornfield emkornfield commented Feb 26, 2026

What changes are proposed in this pull request?

  • Add ParquetWriterConfiguration definition.
  • Add ability to parse ParquetWriterConfiguration from TableProperties
  • Add ability to let default engine accept ParquetWriterConfiguration.

Open question if the Configuration definition should be moved to TableProperties

How was this change tested?

Added unit tests

@emkornfield emkornfield requested a review from scovich March 12, 2026 17:56
@github-actions github-actions Bot deleted a comment from codecov Bot Mar 12, 2026
@github-actions github-actions Bot deleted a comment from scovich Mar 13, 2026
@github-actions github-actions Bot deleted a comment from emkornfield Mar 13, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 83.19328% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.48%. Comparing base (5958a40) to head (d4c104c).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
ffi/src/lib.rs 36.00% 16 Missing ⚠️
kernel/src/engine/default/parquet.rs 94.00% 1 Missing and 2 partials ⚠️
kernel/src/engine/sync/parquet.rs 93.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1960      +/-   ##
==========================================
- Coverage   88.49%   88.48%   -0.01%     
==========================================
  Files         179      179              
  Lines       59008    59100      +92     
  Branches    59008    59100      +92     
==========================================
+ Hits        52220    52296      +76     
- Misses       4786     4800      +14     
- Partials     2002     2004       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Collaborator

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks reasonable, but I think there was a bad merge that reinstated deleted code?

Comment on lines +230 to +233
#[rstest]
#[case("snappy", ParquetCompression::Snappy)]
#[case("SNAPPY", ParquetCompression::Snappy)]
#[case("zstd", ParquetCompression::Zstd)]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed?

Suggested change
#[rstest]
#[case("snappy", ParquetCompression::Snappy)]
#[case("SNAPPY", ParquetCompression::Snappy)]
#[case("zstd", ParquetCompression::Zstd)]
#[rstest]
#[case("snappy", ParquetCompression::Snappy)]
#[case("Snappy", ParquetCompression::Snappy)]
#[case("SNAPPY", ParquetCompression::Snappy)]
#[case("zstd", ParquetCompression::Zstd)]

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional test cases.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These FFI test fixes seem unrelated?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, should be fixed.

Comment thread kernel/src/table_properties/mod.rs Outdated
#[strum(ascii_case_insensitive)]
pub enum ParquetCompression {
/// Snappy compression (default).
#[default]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we decide when zstd is default (above) vs. snappy (here)?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unified to zstd, it is what we merged into the RFC.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad merge? I thought we reverted PathMode?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, should be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change Public API change that could cause downstream compilation failures. Requires a major version bump.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants