-
Notifications
You must be signed in to change notification settings - Fork 129
Description
Background
Delta-kernel-rs currently excludes partition columns from Parquet files, storing partition values only in the transaction log and directory paths. This follows the standard
Delta Lake protocol.
However, Apache Iceberg requires partition values to be written to both the path AND the Parquet file. To support IcebergCompat tables, we need to conditionally include
partition columns in files.
Current Behavior
In kernel/src/transaction/mod.rs, the generate_logical_to_physical() method always filters out partition columns:
// for now, we just pass through all the columns except partition columns.
// note this is _incorrect_ if table config deems we need partition columns.
let fields = schema
.fields()
.filter(|f| !partition_columns.contains(f.name()))
.map(|f| Expression::column([f.name()]));Proposed Solution
Check if IcebergCompatV1 or IcebergCompatV2 is enabled using the is_feature_enabled() API:
let table_config = self.read_snapshot.table_configuration();
let include_partition_columns =
table_config.is_feature_enabled(&TableFeature::IcebergCompatV1) ||
table_config.is_feature_enabled(&TableFeature::IcebergCompatV2);
let fields = schema
.fields()
.filter(|f| !partition_columns.contains(f.name()) || include_partition_columns)
.map(|f| Expression::column([f.name()]));When IcebergCompat enabled: Include partition columns in physical files
When IcebergCompat disabled: Keep current behavior (exclude partition columns)
Changes Required
- Update generate_logical_to_physical() method (line 510-524)
- Update physical schema creation (line 536-546)
- Add tests for both cases (IcebergCompat enabled/disabled)
Related
- Part of icebergCompat table features support #1125 (IcebergCompat support)
- Follow-up from feat!: error on surplus columns in output schema #1488 review (mentioned by @emkornfield @nicklan)
Notes
- The
is_feature_enabled()API was recently added but not yet used in production code - Both IcebergCompatV1 and V2 are currently marked NotSupported, but this fix is a step toward full support