-
Notifications
You must be signed in to change notification settings - Fork 277
Fix hour/minute/second handling for TimestampNTZ #3265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix hour/minute/second handling for TimestampNTZ #3265
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Aligns Comet’s hour/minute/second behavior with Spark semantics by avoiding session-timezone conversion for TimestampNTZ (timestamp without timezone), while preserving the existing conversion behavior for timezone-aware timestamps.
Changes:
- Bypass timezone conversion for
DataType::Timestamp(_, None)(TimestampNTZ) inputs. - Preserve timezone conversion for
DataType::Timestamp(_, Some(_))inputs viaarray_with_timezone. - Add an explicit execution error for unsupported (non-timestamp) input types.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let array = match array.data_type() { | ||
| // TimestampNTZ → DO NOT apply timezone conversion | ||
| DataType::Timestamp(_, None) => array.clone(), | ||
|
|
||
| // Timestamp with timezone → convert from UTC to session timezone | ||
| DataType::Timestamp(_, Some(_)) => array_with_timezone( | ||
| array, |
Copilot
AI
Jan 24, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invoke_with_args now pattern-matches directly on array.data_type() and rejects DataType::Dictionary(_, _) with an execution error. This is a regression for dictionary-encoded timestamp columns (the function’s return_type explicitly supports dictionary input, and array_with_timezone also has dictionary handling). Add explicit handling for DataType::Dictionary(_, value_type) where value_type is Timestamp(_, None) (no conversion) or Timestamp(_, Some(_)) (apply array_with_timezone), instead of falling into the error branch.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3265 +/- ##
============================================
+ Coverage 56.12% 60.07% +3.94%
- Complexity 976 1426 +450
============================================
Files 119 172 +53
Lines 11743 15926 +4183
Branches 2251 2631 +380
============================================
+ Hits 6591 9567 +2976
- Misses 4012 5031 +1019
- Partials 1140 1328 +188 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
86ba72d to
c5cf290
Compare
Which issue does this PR close?
Closes #3180
Rationale for this change
Spark
TimestampNTZvalues represent local time without any timezoneinformation. However, the current Comet implementation applies timezone
conversion when evaluating
hour,minute, andsecond, which leads toincorrect results for
TimestampNTZinputs in non-UTC session timezones.This change aligns Comet behavior with Spark semantics by ensuring that
timezone conversion is only applied to
Timestampvalues with an explicittimezone, and not to
TimestampNTZ.What changes are included in this PR?
extract_date_partused byhour,minute, andsecondto:TimestampNTZinputsTimestampinputs with timezoneHow are these changes tested?
cargo test -p datafusion-comet-spark-expr