Skip to content

Support materializePartitionColumns in the create_table builder #2478

@sanujbasu

Description

@sanujbasu

Description

The kernel create_table builder currently rejects delta.feature.materializePartitionColumns=supported because MaterializePartitionColumns is intentionally not in ALLOWED_DELTA_FEATURES (kernel/src/transaction/builder/create_table.rs:50-78). Tests and connectors that need a materialized-partition table therefore have to hand-write the initial protocol+metadata commit as raw JSON.

This is purely a policy gate -- the feature itself is fully understood by kernel:

1/ MATERIALIZE_PARTITION_COLUMNS_INFO (kernel/src/table_features/mod.rs:434) marks kernel_support: KernelSupport::Supported, the strongest tier (no per-operation caveats).

2/ The write path is wired: Transaction::generate_logical_to_physical (kernel/src/transaction/mod.rs:837) skips the partition-column drop transform when the feature is on, and TableConfiguration::physical_write_schema (kernel/src/table_configuration.rs:465) returns the full schema (partition columns included) when the feature is on.

3/ The read path is correct: the scan reconstructs partition columns from add.partitionValues and ignores the redundant physical copy in the data file. Suboptimal but produces the right output.

4/ End-to-end write coverage already exists in kernel/tests/write/append.rs::test_partitioned_write_materialize (and siblings), via the test_utils::create_table JSON helper.

So the create-table allow-list is the only blocker.

Scope

1/ Allow-list addition (minimum). Add TableFeature::MaterializePartitionColumns to ALLOWED_DELTA_FEATURES so users can enable the feature via delta.feature.materializePartitionColumns=supported. The validation pipeline at create_table.rs:597 already maps this property to the right feature; only the gate is missing.

2/ First-class builder method (preferred). Property-string enablement is the bare-minimum API. A nicer surface mirrors clustering's with_data_layout pattern: extend DataLayout::Partitioned with a materialized: bool flag, route it through process_data_layout to add the writer feature, and have the builder reject combinations that don't make sense (e.g. clustering + materialize). The clustering precedent in ALLOWED_DELTA_FEATURES explicitly avoids the property-string path for exactly this reason -- the comment in source notes that clustering is "enabled by specifying clustering columns via with_data_layout()" rather than a feature signal.

3/ Test cleanup. Once the builder supports the feature, replace the raw-JSON setup in:

3a/ kernel/tests/write/partitioned.rs::test_partition_not_null_rejects_null_in_batch_materialized (added in #2476). Currently uses a local write_initial_commit_with_features helper that writes 0.json by hand. Switch to the kernel builder and delete the helper.

3b/ kernel/tests/write/append.rs::test_partitioned_write_materialize and any siblings. Currently use test_utils::create_table with vec![\"materializePartitionColumns\"] hard-coded. Switch to the kernel builder so the test exercises the same end-to-end path users would.

Tasks

  • Decide between (1) property-string-only or (2) first-class builder method
  • Add MaterializePartitionColumns to ALLOWED_DELTA_FEATURES (or wire via DataLayout if going with option 2)
  • (Option 2) Extend DataLayout::Partitioned with a materialized flag and route through process_data_layout
  • (Option 2) Validate incompatible combinations (clustering + materialize, materialize on non-partitioned)
  • Add positive and round-trip tests in kernel/tests/create_table/
  • Migrate test_partition_not_null_rejects_null_in_batch_materialized to the kernel builder and remove write_initial_commit_with_features
  • Migrate the materialize-feature tests in kernel/tests/write/append.rs to the kernel builder

Context

Companion to PR #2476 (NOT NULL write enforcement). PR #2476 introduces a materialized-partition NOT NULL test that hand-writes 0.json because of the allow-list gap; the doc comment on the helper explicitly references this gap. Closing this issue would let that test (and the existing append-side coverage) drop down to a uniform builder-based setup.

Related:

1/ #2465 -- the parent test-gap issue addressed by #2476.

2/ #2418 -- precedent: Invariants was added to ALLOWED_DELTA_FEATURES to allow delta.feature.invariants=supported while leaving the auto-enable path on non-null schemas as the primary trigger. Same pattern would apply here.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions