Skip to content

Write partition column values to Parquet files when IcebergCompat is enabled #1505

@aleksandarskrbic

Description

@aleksandarskrbic

Background

Delta-kernel-rs currently excludes partition columns from Parquet files, storing partition values only in the transaction log and directory paths. This follows the standard
Delta Lake protocol.

However, Apache Iceberg requires partition values to be written to both the path AND the Parquet file. To support IcebergCompat tables, we need to conditionally include
partition columns in files.

Current Behavior

In kernel/src/transaction/mod.rs, the generate_logical_to_physical() method always filters out partition columns:

// for now, we just pass through all the columns except partition columns.
// note this is _incorrect_ if table config deems we need partition columns.
let fields = schema
    .fields()
    .filter(|f| !partition_columns.contains(f.name()))
    .map(|f| Expression::column([f.name()]));

Proposed Solution

Check if IcebergCompatV1 or IcebergCompatV2 is enabled using the is_feature_enabled() API:

  let table_config = self.read_snapshot.table_configuration();
  let include_partition_columns =
      table_config.is_feature_enabled(&TableFeature::IcebergCompatV1) ||
      table_config.is_feature_enabled(&TableFeature::IcebergCompatV2);

  let fields = schema
      .fields()
      .filter(|f| !partition_columns.contains(f.name()) || include_partition_columns)
      .map(|f| Expression::column([f.name()]));

When IcebergCompat enabled: Include partition columns in physical files
When IcebergCompat disabled: Keep current behavior (exclude partition columns)

Changes Required

  1. Update generate_logical_to_physical() method (line 510-524)
  2. Update physical schema creation (line 536-546)
  3. Add tests for both cases (IcebergCompat enabled/disabled)

Related

Notes

  • The is_feature_enabled() API was recently added but not yet used in production code
  • Both IcebergCompatV1 and V2 are currently marked NotSupported, but this fix is a step toward full support

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions