rust: Reduce size of some huge common objects: Plan, PlanNode, Expr<Raw> #35292
Closed
def- wants to merge 3 commits into MaterializeInc:main
Box `Statement<Raw>` in DeclarePlan/PreparePlan (eliminating the ~832-byte AST inline), then box 17 Plan variants larger than 200 bytes. SelectPlan (184 bytes) is kept unboxed as the hot-path variant for every SELECT query.

The Plan enum represents every SQL statement type. Before this change, even a simple `SELECT 1` allocated ~1888 bytes on the stack for the Plan. After: 184 bytes, a 90% reduction. DDL/EXPLAIN/SUBSCRIBE operations pay one extra heap allocation, negligible compared to their inherent catalog transaction cost.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
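For readers unfamiliar with why boxing shrinks an enum: a Rust enum is at least as large as its largest variant, so one oversized cold variant inflates every value, including the hot ones. The sketch below illustrates this with hypothetical stand-in types (`SelectLike`, `DdlLike`, `PlanInline`, `PlanBoxed`); they only borrow the byte counts quoted above and are not the real Plan definitions.

```rust
use std::mem::size_of;

// Hypothetical stand-ins, not Materialize's types. The payload sizes
// loosely mirror the figures quoted in the commit message.
pub struct SelectLike(pub [u8; 184]); // hot-path payload, kept inline
pub struct DdlLike(pub [u8; 1880]); // large payload that is rarely built

// Every value of this enum is at least as large as `DdlLike`,
// even when it holds the small `Select` variant.
pub enum PlanInline {
    Select(SelectLike),
    Ddl(DdlLike),
}

// Boxing the cold variant leaves only a pointer inline, so the enum
// shrinks to roughly the size of its largest remaining variant.
pub enum PlanBoxed {
    Select(SelectLike),
    Ddl(Box<DdlLike>),
}

/// Returns (inline size, boxed size) in bytes for the two layouts.
pub fn plan_sizes() -> (usize, usize) {
    (size_of::<PlanInline>(), size_of::<PlanBoxed>())
}
```

Calling `plan_sizes()` should show the boxed layout tracking the size of the hot variant rather than the cold one. The exact numbers depend on the compiler's layout decisions, so none are stated here.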
Box 9 large inline fields in PlanNode<T> (GetPlan, JoinPlan, KeyValPlan, ReducePlan, TopKPlan, and MapFilterProject in 4 positions). These are constructed once during plan lowering and read during dataflow rendering, so the extra indirection is negligible.

PlanNode is the LIR stored per-dataflow on every compute worker, persisting in memory for the lifetime of each dataflow. The 73% size reduction (384→104 bytes) reduces per-node memory across all active dataflows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
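A minimal sketch of the same idea applied to a field inside a struct-like variant, rather than to a whole variant. The names (`MfpLike`, `NodeInline`, `NodeBoxed`, `first_expr`) are hypothetical and the field layout is invented for illustration; the point is only that read sites keep working unchanged because `Box` auto-dereferences.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large plan fragment (160 bytes of payload).
pub struct MfpLike {
    pub exprs: [u64; 20],
}

// Large field stored inline: every node pays for it.
pub enum NodeInline {
    Constant(u64),
    Mfp { input_id: u64, mfp: MfpLike },
}

// Same shape, but the large field sits behind a pointer.
pub enum NodeBoxed {
    Constant(u64),
    Mfp { input_id: u64, mfp: Box<MfpLike> },
}

/// Reads through the box exactly as it would read an inline field.
pub fn first_expr(node: &NodeBoxed) -> Option<u64> {
    match node {
        // `mfp` is `&Box<MfpLike>`; field access auto-dereferences it.
        NodeBoxed::Mfp { mfp, .. } => Some(mfp.exprs[0]),
        NodeBoxed::Constant(_) => None,
    }
}

/// Returns (inline size, boxed size) in bytes for the two layouts.
pub fn node_sizes() -> (usize, usize) {
    (size_of::<NodeInline>(), size_of::<NodeBoxed>())
}
```

Construction sites are the only code that changes, since they need a `Box::new(...)` around the field. That fits the commit's reasoning that nodes are built once and read many times.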
…riants
Two complementary optimizations that shrink the core AST Expr enum:
1. Expr::Function(Function<T>) → Function(Box<Function<T>>).
Function<Raw> is 240 bytes and was the single largest variant,
inflating the entire enum. Boxing reduces it to 8 bytes inline.
2. Expr::Cast { data_type: T::DataType } → { data_type: Box<T::DataType> }.
RawDataType can be up to ~48 bytes; boxing reduces to 8 bytes.
Expr<Raw> is the most numerous AST node type — every expression in
every SQL query is represented as one. The savings compound because
Expr is stored recursively: every Vec<Expr<T>> element and every
Box<Expr<T>> allocation saves 168 bytes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
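To make the compounding effect concrete, here is a rough sketch with hypothetical types (`FunctionLike`, `ExprInline`, `ExprBoxed`) that are much simpler than the real AST. A `Vec<Expr>` stores its elements inline in one buffer, so the buffer grows with `len * size_of::<Expr>()`, and any per-node saving is multiplied by the number of expressions.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large function-call node.
pub struct FunctionLike {
    pub name: String,
    pub payload: [u64; 27],
}

// Largest variant stored inline: it sets the size of every Expr.
pub enum ExprInline {
    Value(i64),
    Nested(Box<ExprInline>),
    Function(FunctionLike),
}

// Largest variant boxed: the enum shrinks to fit the next-largest variant.
pub enum ExprBoxed {
    Value(i64),
    Nested(Box<ExprBoxed>),
    Function(Box<FunctionLike>),
}

/// Bytes of element storage that a `Vec<T>` of `len` items needs.
pub fn vec_payload_bytes<T>(len: usize) -> usize {
    len * size_of::<T>()
}

/// Bytes saved per expression node by boxing the function variant.
pub fn per_node_saving() -> usize {
    size_of::<ExprInline>() - size_of::<ExprBoxed>()
}
```

Note that boxing the largest variant only shrinks the enum down to its next-largest variant, which is why the per-node saving in a real enum can be smaller than the size of the boxed payload.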
Extracted and cleaned up from #35255