feat: allow materializePartitionColumns feature signal in CREATE TABLE #2481

Merged
sanujbasu merged 1 commit into delta-io:main from sanujbasu:materialize-partition-columns-create-table on Apr 28, 2026
Conversation

@sanujbasu
Collaborator

@sanujbasu sanujbasu commented Apr 28, 2026

What changes are proposed in this pull request?

Adds TableFeature::MaterializePartitionColumns to ALLOWED_DELTA_FEATURES.

The feature is already fully understood by kernel: it is KernelSupport::Supported, the write path keeps partition columns in data files when the feature is on (Transaction::generate_logical_to_physical and TableConfiguration::physical_write_schema), and the read path is correct (scan reconstructs partition columns from add.partitionValues regardless of whether they are materialized in the file).
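
As a rough illustration of the write-path behavior described above, the sketch below models which columns end up in a data file with the feature off and on. The `Field` type and the `physical_write_columns` function are hypothetical stand-ins invented for this example; they are not the kernel's `StructField` or `TableConfiguration::physical_write_schema`.

```rust
/// Hypothetical stand-in for a schema field; not the kernel's `StructField`.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
}

/// Returns the columns that would be written to the data file in this toy model.
///
/// With `materialize == false`, partition columns are dropped from the file
/// because their values are carried in `add.partitionValues`.
/// With `materialize == true`, every logical column is kept.
fn physical_write_columns(
    logical: &[Field],
    partition_columns: &[&str],
    materialize: bool,
) -> Vec<Field> {
    logical
        .iter()
        .filter(|f| materialize || !partition_columns.contains(&f.name.as_str()))
        .cloned()
        .collect()
}

fn main() {
    let logical = vec![
        Field { name: "id".to_string() },
        Field { name: "date".to_string() },
    ];

    let feature_off = physical_write_columns(&logical, &["date"], false);
    let feature_on = physical_write_columns(&logical, &["date"], true);

    println!("feature off: {feature_off:?}");
    println!("feature on:  {feature_on:?}");
}
```

On a non-partitioned table the partition column list is empty, so the flag changes nothing in this model, which is one way to see why the feature is harmless there.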

The remaining gap was the create_table builder allow-list, which rejected delta.feature.materializePartitionColumns=supported at build time.

There is no delta.* enablement property for materializePartitionColumns in kernel today (MATERIALIZE_PARTITION_COLUMNS_INFO is EnablementCheck::AlwaysIfSupported), so the
only opt-in at create time is the explicit feature signal.
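
The allow-list gate can be pictured with a small self-contained model. Everything here is an assumption made for illustration: the `Feature` enum, the `ALLOWED` constant, and `check_feature_signal` are simplified stand-ins, not the kernel's `TableFeature`, `ALLOWED_DELTA_FEATURES`, or builder code, and `NotAllowListed` is a made-up variant.

```rust
/// Hypothetical subset of table features; the real `TableFeature` enum is larger.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Feature {
    Invariants,
    MaterializePartitionColumns,
    /// Made-up feature that is deliberately left off the allow-list below.
    NotAllowListed,
}

/// Toy allow-list, standing in for the builder's list of accepted feature signals.
const ALLOWED: &[Feature] = &[Feature::Invariants, Feature::MaterializePartitionColumns];

fn parse_feature(name: &str) -> Option<Feature> {
    match name {
        "invariants" => Some(Feature::Invariants),
        "materializePartitionColumns" => Some(Feature::MaterializePartitionColumns),
        "notAllowListed" => Some(Feature::NotAllowListed),
        _ => None,
    }
}

/// Returns `Ok(Some(feature))` for an accepted `delta.feature.<name>=supported` signal,
/// `Ok(None)` for properties that are not feature signals, and `Err` otherwise.
fn check_feature_signal(key: &str, value: &str) -> Result<Option<Feature>, String> {
    let Some(name) = key.strip_prefix("delta.feature.") else {
        return Ok(None);
    };
    if value != "supported" {
        return Err(format!("feature signal {key} must be 'supported', got '{value}'"));
    }
    let feature = parse_feature(name).ok_or_else(|| format!("unknown feature: {name}"))?;
    if ALLOWED.contains(&feature) {
        Ok(Some(feature))
    } else {
        Err(format!("feature {name} is not allowed at create time"))
    }
}

fn main() {
    let accepted = check_feature_signal("delta.feature.materializePartitionColumns", "supported");
    let rejected = check_feature_signal("delta.feature.notAllowListed", "supported");
    println!("{accepted:?}");
    println!("{rejected:?}");
}
```

In this model, the change in this PR corresponds to adding one entry to `ALLOWED`; before that, the signal would have taken the final `Err` branch at build time.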

How was this change tested?

1/ New unit test test_create_table_with_materialize_partition_columns_feature_signal_allowed in kernel/tests/create_table/partitioned.rs. It asserts that the feature signal lands in writerFeatures exactly once, does not appear in readerFeatures (writer-only), and that is_feature_enabled returns true.

2/ Local verification:

  • cargo +nightly fmt --check
  • cargo clippy --workspace --benches --tests --all-features -- -D warnings
  • cargo doc --workspace --all-features --no-deps
  • cargo test -p delta_kernel --all-features for affected paths

@codecov

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.43%. Comparing base (8e4c296) to head (66b5a42).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2481   +/-   ##
=======================================
  Coverage   88.43%   88.43%           
=======================================
  Files         178      178           
  Lines       58417    58417           
  Branches    58417    58417           
=======================================
  Hits        51660    51660           
  Misses       4787     4787           
  Partials     1970     1970           


sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request Apr 28, 2026
## What changes are proposed in this pull request?

Fixes delta-io#2465.

Closes the write-time enforcement gap left by delta-io#2418, which auto-enables the invariants writer feature on schemas with non-null columns but does not itself reject nulls on writes. Kernel now rejects Scalar::Null partition values for partition columns with nullable: false in partitioned_write_context, matching Delta-Spark's DELTA_NOT_NULL_CONSTRAINT_VIOLATED (SQLSTATE 23502).

The check sits in validate_types alongside the existing type checks. The Scalar::Null inner type continues to be ignored on nulls, so connectors can keep constructing null partition values without first looking up the schema; only the schema field's nullability is enforced. This matches Catalyst's NullType permissiveness and the type-erased on-wire format (partitionValues serializes nulls as JSON null with no type information).
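
The rule described above (reject a null for a non-nullable partition column, and ignore the inner type carried by the null) can be sketched as follows. This is a simplified model under stated assumptions: `TypeTag`, `Scalar`, `PartitionField`, and `validate_partition_value` are hypothetical names for this example and do not reproduce the kernel's `validate_types` or its `Scalar` type.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TypeTag {
    Integer,
    String,
}

/// Simplified scalar; `Null` carries an inner type tag that validation ignores.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Integer(i32),
    String(String),
    Null(TypeTag),
}

impl Scalar {
    fn type_tag(&self) -> TypeTag {
        match self {
            Scalar::Integer(_) => TypeTag::Integer,
            Scalar::String(_) => TypeTag::String,
            Scalar::Null(tag) => *tag,
        }
    }
}

/// Hypothetical partition column description.
struct PartitionField {
    name: &'static str,
    type_tag: TypeTag,
    nullable: bool,
}

fn validate_partition_value(field: &PartitionField, value: &Scalar) -> Result<(), String> {
    match value {
        // Only the schema field's nullability matters for nulls; the inner tag is ignored.
        Scalar::Null(_) if field.nullable => Ok(()),
        Scalar::Null(_) => Err(format!(
            "NOT NULL constraint violated for partition column `{}`",
            field.name
        )),
        // Non-null values still go through the ordinary type check.
        other if other.type_tag() == field.type_tag => Ok(()),
        other => Err(format!(
            "partition column `{}` expects {:?}, got {:?}",
            field.name,
            field.type_tag,
            other.type_tag()
        )),
    }
}

fn main() {
    let date = PartitionField { name: "date", type_tag: TypeTag::String, nullable: false };

    // A null with a mismatched inner type is rejected for nullability, not for its type.
    println!("{:?}", validate_partition_value(&date, &Scalar::Null(TypeTag::Integer)));
    println!("{:?}", validate_partition_value(&date, &Scalar::String("2026-04-28".to_string())));
    println!("{:?}", validate_partition_value(&date, &Scalar::Integer(7)));
}
```

Keeping the null arm ahead of the type check is what lets a connector build a null partition value without first looking up the column's type in the schema.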

## How was this change tested?

Adds integration tests covering all three non-null write paths:

1/ Non-materialized partitioned writes (default): kernel rejects the null Scalar at partitioned_write_context time, before serialization. Covered across all column-mapping modes via the kernel create_table builder.

2/ Materialized partitioned writes (materializePartitionColumns): kernel delegates to the engine's ParquetHandler. The default engine inherits the guarantee from arrow-rs, whose RecordBatch::try_new rejects nulls in fields with nullable: false. The table for this case is set up via the kernel create_table builder with delta.feature.materializePartitionColumns=supported (allow-listed by delta-io#2481).

3/ Plain non-null data columns: same engine-delegation seam, exercised across a representative primitive type matrix (integer, long, string, binary, boolean, timestamp, decimal). The table is created via the kernel create_table builder so the non-null schema drives the auto-enablement of the invariants writer feature end-to-end; the test asserts both the auto-enable and the downstream RecordBatch::try_new rejection. Pins the full chain documented on maybe_enable_invariants in transaction/builder/create_table.rs.

Local verification:

1/ cargo +nightly fmt --check
2/ cargo clippy --workspace --benches --tests --all-features -- -D warnings
3/ cargo doc --workspace --all-features --no-deps
4/ cargo test -p delta_kernel --all-features for affected paths
@sanujbasu sanujbasu force-pushed the materialize-partition-columns-create-table branch from 173d724 to f731408 Compare April 28, 2026 05:56
sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request Apr 28, 2026
@sanujbasu sanujbasu force-pushed the materialize-partition-columns-create-table branch from f731408 to fe9319d Compare April 28, 2026 06:20
Collaborator

@scottsand-db scottsand-db left a comment

Looks good but I think we can re-use an existing test setup?

Comment thread kernel/src/transaction/builder/create_table.rs Outdated
Comment thread kernel/src/transaction/builder/create_table.rs Outdated
Comment thread kernel/tests/create_table/partitioned.rs Outdated
sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request Apr 28, 2026
@sanujbasu sanujbasu force-pushed the materialize-partition-columns-create-table branch from fe9319d to 34b6516 Compare April 28, 2026 18:22
@sanujbasu
Collaborator Author

Range-diff: main (fe9319d -> 34b6516)
kernel/src/transaction/builder/create_table.rs
@@ -4,13 +4,10 @@
      // the feature on an all-nullable table so a later ALTER TABLE ADD COLUMN NOT NULL does
      // not need a protocol upgrade.
      TableFeature::Invariants,
-+    // MaterializePartitionColumns keeps partition columns in the data files (rather than
-+    // dropping them via the logical-to-physical transform). The feature is fully wired in
-+    // kernel (write: `Transaction::generate_logical_to_physical` and
-+    // `TableConfiguration::physical_write_schema`; read: scan reconstructs from
-+    // `add.partitionValues` regardless), and `KernelSupport::Supported`. There is no
-+    // `delta.*` enablement property for it, so the only opt-in at create time is the
-+    // explicit feature signal `delta.feature.materializePartitionColumns=supported`.
++    // MaterializePartitionColumns keeps partition columns in the data files instead of
++    // dropping them on write. There is no `delta.*` enablement property; the only opt-in at
++    // create time is the explicit feature signal
++    // `delta.feature.materializePartitionColumns=supported`.
 +    TableFeature::MaterializePartitionColumns,
  ];
  
kernel/tests/create_table/main.rs
@@ -0,0 +1,16 @@
+diff --git a/kernel/tests/create_table/main.rs b/kernel/tests/create_table/main.rs
+--- a/kernel/tests/create_table/main.rs
++++ b/kernel/tests/create_table/main.rs
+ #[case("appendOnly", TableFeature::AppendOnly, false, false)]
+ #[case("changeDataFeed", TableFeature::ChangeDataFeed, false, false)]
+ #[case("rowTracking", TableFeature::RowTracking, false, false)]
++// WriterOnly features (AlwaysIfSupported)
++#[case(
++    "materializePartitionColumns",
++    TableFeature::MaterializePartitionColumns,
++    false,
++    true
++)]
+ fn test_create_table_with_feature_signal(
+     #[case] feature_name: &str,
+     #[case] feature: TableFeature,
\ No newline at end of file
kernel/tests/create_table/partitioned.rs
@@ -8,50 +8,52 @@
  use delta_kernel::transaction::create_table::create_table;
  use delta_kernel::transaction::data_layout::DataLayout;
  use delta_kernel::DeltaResult;
+ use rstest::rstest;
+ use test_utils::test_table_setup;
+ 
+-use super::partition_test_schema;
++use super::{partition_test_schema, simple_schema};
+ 
+ #[rstest]
+ #[case::exact_casing("date")]
  
      Ok(())
  }
 +
-+/// CREATE TABLE with `delta.feature.materializePartitionColumns=supported` lands the feature
-+/// in `writerFeatures` exactly once and not in `readerFeatures` (writer-only feature). The
-+/// feature has no `delta.*` enablement property and is not auto-enabled by any schema or
-+/// metadata, so the explicit feature signal is the only opt-in path at create time.
-+#[test]
-+fn test_create_table_with_materialize_partition_columns_feature_signal_allowed() -> DeltaResult<()>
-+{
++/// CREATE TABLE accepts `delta.feature.materializePartitionColumns=supported` regardless of
++/// whether the table is partitioned. The feature is harmless on a non-partitioned table
++/// (nothing to materialize) and Delta-Spark also does not error in that case, so kernel
++/// matches that behavior. Generic feature-signal landing assertions are covered by
++/// `test_create_table_with_feature_signal` in `main.rs`; this test pins only the
++/// partitioned-vs-non-partitioned acceptance.
++#[rstest]
++#[case::partitioned(true)]
++#[case::non_partitioned(false)]
++fn test_create_table_with_materialize_partition_columns_partitioned_and_not(
++    #[case] partitioned: bool,
++) -> DeltaResult<()> {
 +    let (_temp_dir, table_path, engine) = test_table_setup()?;
-+    let schema = partition_test_schema()?;
++    let schema = if partitioned {
++        partition_test_schema()?
++    } else {
++        simple_schema()?
++    };
 +
-+    let _ = create_table(&table_path, schema, "Test/1.0")
-+        .with_data_layout(DataLayout::partitioned(["date"]))
-+        .with_table_properties([("delta.feature.materializePartitionColumns", "supported")])
++    let mut builder = create_table(&table_path, schema, "Test/1.0")
++        .with_table_properties([("delta.feature.materializePartitionColumns", "supported")]);
++    if partitioned {
++        builder = builder.with_data_layout(DataLayout::partitioned(["date"]));
++    }
++    let _ = builder
 +        .build(engine.as_ref(), Box::new(FileSystemCommitter::new()))?
 +        .commit(engine.as_ref())?;
 +
 +    let snapshot = Snapshot::builder_for(&table_path).build(engine.as_ref())?;
-+    let protocol = snapshot.table_configuration().protocol();
-+    let writer_features = protocol
-+        .writer_features()
-+        .expect("writer features should be present");
-+    let count = writer_features
-+        .iter()
-+        .filter(|f| **f == TableFeature::MaterializePartitionColumns)
-+        .count();
-+    assert_eq!(
-+        count, 1,
-+        "MaterializePartitionColumns should appear exactly once in writerFeatures; got {count}"
-+    );
 +    assert!(
-+        !protocol
-+            .reader_features()
-+            .is_some_and(|f| f.contains(&TableFeature::MaterializePartitionColumns)),
-+        "MaterializePartitionColumns is writer-only and must not appear in readerFeatures"
-+    );
-+    assert!(
 +        snapshot
 +            .table_configuration()
 +            .is_feature_enabled(&TableFeature::MaterializePartitionColumns),
-+        "MaterializePartitionColumns should be enabled when listed in writerFeatures"
++        "MaterializePartitionColumns should be enabled (partitioned={partitioned})"
 +    );
 +
 +    Ok(())

Reproduce locally: git range-diff 8e4c296..fe9319d 8e4c296..34b6516 | Disable: git config gitstack.push-range-diff false

sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request Apr 28, 2026
@sanujbasu sanujbasu force-pushed the materialize-partition-columns-create-table branch from 34b6516 to 66b5a42 Compare April 28, 2026 18:25
@sanujbasu
Collaborator Author

Range-diff: main (34b6516 -> 66b5a42)
kernel/tests/create_table/partitioned.rs
@@ -23,9 +23,7 @@
 +/// CREATE TABLE accepts `delta.feature.materializePartitionColumns=supported` regardless of
 +/// whether the table is partitioned. The feature is harmless on a non-partitioned table
 +/// (nothing to materialize) and Delta-Spark also does not error in that case, so kernel
-+/// matches that behavior. Generic feature-signal landing assertions are covered by
-+/// `test_create_table_with_feature_signal` in `main.rs`; this test pins only the
-+/// partitioned-vs-non-partitioned acceptance.
++/// matches that behavior.
 +#[rstest]
 +#[case::partitioned(true)]
 +#[case::non_partitioned(false)]

Reproduce locally: git range-diff 8e4c296..34b6516 8e4c296..66b5a42 | Disable: git config gitstack.push-range-diff false

@sanujbasu sanujbasu requested a review from scottsand-db April 28, 2026 18:25
Collaborator

@scottsand-db scottsand-db left a comment

LGTM!

@sanujbasu sanujbasu added this pull request to the merge queue Apr 28, 2026
Merged via the queue into delta-io:main with commit 943234c Apr 28, 2026
24 of 25 checks passed
@sanujbasu sanujbasu deleted the materialize-partition-columns-create-table branch April 28, 2026 21:11
sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request Apr 29, 2026
sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request Apr 29, 2026
sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request Apr 30, 2026
sanujbasu added a commit to sanujbasu/delta-kernel-rs that referenced this pull request May 2, 2026
Fixes delta-io#2465.

Closes the write-time enforcement gap left by delta-io#2418, which auto-enables the
invariants writer feature on schemas with non-null columns but does not itself
reject nulls on writes. Kernel now rejects Scalar::Null partition values for
partition columns with nullable: false in partitioned_write_context, matching
Delta-Spark's DELTA_NOT_NULL_CONSTRAINT_VIOLATED (SQLSTATE 23502).

The check sits in validate_types alongside the existing type checks. The
Scalar::Null inner type continues to be ignored on nulls, so connectors can keep
constructing null partition values without first looking up the schema; only the
schema field's nullability is enforced. This matches Catalyst's NullType
permissiveness and the type-erased on-wire format (partitionValues serializes
nulls as JSON null with no type information).

Adds integration tests covering all three non-null write paths:

1/ Non-materialized partitioned writes (default): kernel rejects the null Scalar
   at partitioned_write_context time, before serialization. Covered across all
   column-mapping modes via the kernel create_table builder.

2/ Materialized partitioned writes (materializePartitionColumns): kernel
   delegates to the engine's ParquetHandler. The default engine inherits the
   guarantee from arrow-rs, whose RecordBatch::try_new rejects nulls in fields
   with nullable: false. The table for this case is set up via the kernel
   create_table builder with delta.feature.materializePartitionColumns=supported
   (allow-listed by delta-io#2481).

3/ Plain non-null data columns: same engine-delegation seam, exercised across a
   representative primitive type matrix (integer, long, string, binary, boolean,
   timestamp, decimal). The table is created via the kernel create_table builder
   so the non-null schema drives the auto-enablement of the invariants writer
   feature end-to-end; the test asserts both the auto-enable and the downstream
   RecordBatch::try_new rejection. Pins the full chain documented on
   maybe_enable_invariants in transaction/builder/create_table.rs.

Local verification:

1/ cargo +nightly fmt --check
2/ cargo clippy --workspace --benches --tests --all-features -- -D warnings
3/ cargo doc --workspace --all-features --no-deps
4/ cargo test -p delta_kernel --all-features for affected paths
CynicDog pushed a commit to CynicDog/delta-kernel-rs that referenced this pull request May 2, 2026
…2483)

## What changes are proposed in this pull request?

Fixes delta-io#2465.

Closes the write-time enforcement gap left by delta-io#2418, which auto-enables
the invariants writer feature on schemas with non-null columns but does
not itself reject nulls on writes. Kernel now rejects Scalar::Null
partition values for partition columns with nullable: false in
partitioned_write_context, matching Delta-Spark's
DELTA_NOT_NULL_CONSTRAINT_VIOLATED (SQLSTATE 23502).

The check sits in validate_types alongside the existing type checks. The
Scalar::Null inner type continues to be ignored on nulls, so connectors
can keep constructing null partition values without first looking up the
schema; only the schema field's nullability is enforced. This matches
Catalyst's NullType permissiveness and the type-erased on-wire format
(partitionValues serializes nulls as JSON null with no type
information).

## How was this change tested?

Adds integration tests covering all three non-null write paths:

1/ Non-materialized partitioned writes (default): kernel rejects the
null Scalar at partitioned_write_context time, before serialization.
Covered across all column-mapping modes via the kernel create_table
builder.

2/ Materialized partitioned writes (materializePartitionColumns): kernel
delegates to the engine's ParquetHandler. The default engine inherits
the guarantee from arrow-rs, whose RecordBatch::try_new rejects nulls in
fields with nullable: false. The table for this case is set up via the
kernel create_table builder with
delta.feature.materializePartitionColumns=supported (allow-listed by
delta-io#2481).

3/ Plain non-null data columns: same engine-delegation seam, exercised
across a representative primitive type matrix (integer, long, string,
binary, boolean, timestamp, decimal). The table is created via the
kernel create_table builder so the non-null schema drives the
auto-enablement of the invariants writer feature end-to-end; the test
asserts both the auto-enable and the downstream RecordBatch::try_new
rejection. Pins the full chain documented on maybe_enable_invariants in
transaction/builder/create_table.rs.
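
The auto-enablement that case 3 depends on can be pictured with a small sketch. It is an assumption-laden illustration rather than the real `maybe_enable_invariants`: the function name `maybe_enable_invariants_sketch`, the `(column name, nullable)` tuple schema, and the flat `Vec<String>` feature list are all hypothetical, and nested fields are not modeled.

```rust
/// Adds the `invariants` writer feature when the schema has at least one
/// non-nullable column and the feature is not already listed.
/// Each schema entry is a hypothetical `(column name, nullable)` pair.
/// Returns `true` only when the feature was newly added.
fn maybe_enable_invariants_sketch(
    schema: &[(&str, bool)],
    writer_features: &mut Vec<String>,
) -> bool {
    const INVARIANTS: &str = "invariants";
    let has_non_null_column = schema.iter().any(|&(_, nullable)| !nullable);
    let already_listed = writer_features.iter().any(|f| f.as_str() == INVARIANTS);
    if has_non_null_column && !already_listed {
        writer_features.push(INVARIANTS.to_string());
        return true;
    }
    false
}
```

A schema with only nullable columns leaves the feature list untouched, and calling the function twice does not add the feature a second time.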

Local verification:

1/ cargo +nightly fmt --check
2/ cargo clippy --workspace --benches --tests --all-features -- -D
warnings
3/ cargo doc --workspace --all-features --no-deps
4/ cargo test -p delta_kernel --all-features for affected paths

Co-authored-by: Sanuj Basu <[email protected]>