[BugFix] Fix shared object mutation in PushDownAggregateRewriter for case-when/if#71309

Open
lxynov wants to merge 2 commits into StarRocks:main from lxynov:bugfix-aggpd

@lxynov lxynov commented Apr 5, 2026

Why I'm doing:

When pushing down an aggregate whose argument is a nested case-when/if, PushDownAggregateCollector didn't add the inner condition columns to the group-by list, causing a query failure.

The issue can be reproduced with the following query:

WITH cte1 AS (
  SELECT
    t.t1d AS fk,
    t.t1a AS cat,
    CASE WHEN t.t1b = 1 THEN t.t1e ELSE t.t1f END AS cval
  FROM test_all_type t
),
cte2 AS (
  SELECT a.cval, a.fk, a.cat
  FROM cte1 a
  LEFT JOIN t1 ON a.fk = t1.v4
),
cte3 AS (
  SELECT CASE WHEN c.cat THEN c.cval ELSE NULL END gval, c.fk
  FROM cte2 c
)
SELECT SUM(gval)
FROM cte3
GROUP BY fk;

It fails with:

com.starrocks.sql.common.StarRocksPlannerException: only found column statistics: {1: t1a, 2: t1b, 4: t1d}, but missing statistic of col: 19: sum.

What I'm doing:

Issue Analysis

The above SQL produces nested IF projections:

aggregations={16: sum=sum(15: case)}
15: case->if(cast(1: t1a as boolean), 11: case, null)
11: case->if(2: t1b = 1, cast(5: t1e as double), 6: t1f)

When the collector visits the LogicalProjectOperator with 15: case, it rewrites the aggregation to

16: sum -> sum(if(cast(1: t1a as boolean), 11: case, null))

then it correctly adds t1a to the group-by columns.

Later, when the collector visits LogicalProjectOperator with 11: case, it rewrites the aggregation to

sum(if(false, if(2: t1b = 1, cast(5: t1e as double), 6: t1f), null))

But this time, it doesn't see t1b in the "condition" columns, so it doesn't add t1b into the group-by columns.

This differs from the rewriter's behavior.

When the collector and the rewriter see different group-by lists, the pushdown fails. However, the rewriter works top-down: it rewrites the upper nodes first and only aborts once it reaches the table scan, which produces an incorrect plan.
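
To illustrate why a top-down abort is harmful, here is a minimal toy sketch in Python. It is not StarRocks code: `Node`, `rewrite_top_down`, and the `scan_ok` flag are hypothetical names, and the model assumes only that each node is mutated before its child is validated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical plan node: a name, one optional child, a rewritten flag."""
    name: str
    child: Optional["Node"] = None
    rewritten: bool = False


def rewrite_top_down(node: Node, scan_ok: bool) -> None:
    """Rewrite a node first, then descend; abort if the scan check fails."""
    if node.child is None:  # leaf: stands in for the table scan
        if not scan_ok:
            raise RuntimeError("pushdown aborted at scan")
        node.rewritten = True
        return
    node.rewritten = True  # mutation happens before the child is validated
    rewrite_top_down(node.child, scan_ok)


def rewritten_names(node: Optional[Node]) -> List[str]:
    """Names of all nodes that were mutated, from the root down."""
    names = []
    while node is not None:
        if node.rewritten:
            names.append(node.name)
        node = node.child
    return names


plan = Node("aggregate", Node("project", Node("scan")))
try:
    rewrite_top_down(plan, scan_ok=False)
except RuntimeError:
    pass  # the abort does not undo the earlier mutations

partial = rewritten_names(plan)
print(partial)  # ['aggregate', 'project']
```

In this toy model the aggregate and the projection have already been rewritten when the scan check fails, so the tree is left in a half-rewritten state. That is the kind of inconsistency the error message above hints at, where the plan refers to a `sum` column that was never produced.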

Fix

Modify the collector to add inner case-when/if's condition columns into the group-by list as well.
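
The idea can be sketched with a toy expression model in Python. This is only an illustration under stated assumptions: `Col`, `Eq`, `Const`, `If`, and both collection functions are hypothetical names, not the classes or methods in PushDownAggregateCollector, and the cast is left out for brevity.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class Eq:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class If:
    cond: "Expr"
    then: "Expr"
    other: "Expr"


Expr = Union[Col, Const, Eq, If]


def columns(e: Expr) -> Set[str]:
    """All column names referenced anywhere in an expression."""
    if isinstance(e, Col):
        return {e.name}
    if isinstance(e, Eq):
        return columns(e.left) | columns(e.right)
    if isinstance(e, If):
        return columns(e.cond) | columns(e.then) | columns(e.other)
    return set()


def outer_condition_columns(e: Expr) -> Set[str]:
    """Shallow variant: only inspects the condition of the outermost IF."""
    return columns(e.cond) if isinstance(e, If) else set()


def all_condition_columns(e: Expr) -> Set[str]:
    """Recursive variant: also visits IFs nested in the value branches."""
    if not isinstance(e, If):
        return set()
    return (columns(e.cond)
            | all_condition_columns(e.then)
            | all_condition_columns(e.other))


# First visit: if(t1a, <11: case>, null)
first = If(Col("t1a"), Col("case11"), Const(None))
# Second visit: if(false, if(t1b = 1, t1e, t1f), null)
inlined = If(Const(False),
             If(Eq(Col("t1b"), Const(1)), Col("t1e"), Col("t1f")),
             Const(None))

print(outer_condition_columns(first))    # {'t1a'}
print(outer_condition_columns(inlined))  # set()
print(all_condition_columns(inlined))    # {'t1b'}
```

In this sketch, the shallow variant finds `t1a` on the first visit but finds nothing on the second, because the outer condition is by then a constant. The recursive variant reaches the inner condition and reports `t1b`, which matches the column the fix adds to the group-by list.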

For the above query, this is how the aggregate pushdown behaves with this fix:

Before the pushdown:

LOGICAL
->  LogicalProjectOperator {projection=[16: sum->16: sum]}
    ->  LogicalAggregation {type=GLOBAL ,aggregations={16: sum=sum(15: case)} ,groupKeys=[4: t1d], partitionBys=[4: t1d]}
        ->  LogicalProjectOperator {projection=[4: t1d->4: t1d, 15: case->if(cast(1: t1a as boolean), 11: case, null)]}
            ->  LOGICAL_JOIN {LEFT OUTER JOIN, onPredicate = 4: t1d = 12: v4 , Predicate = null}
                ->  LogicalProjectOperator {projection=[1: t1a->1: t1a, 4: t1d->4: t1d, 11: case->if(2: t1b = 1, cast(5: t1e as double), 6: t1f)]}
                    ->  LogicalOlapScanOperator {table=test_all_type, selectedPartitionId=[10136], selectedIndexMetaId=10138, outputColumns=[1: t1a, 2: t1b, 4: t1d, 5: t1e, 6: t1f], prunedPartitionPredicates=[], limit=-1}
                ->  LogicalProjectOperator {projection=[12: v4->12: v4]}
                    ->  LogicalOlapScanOperator {table=t1, selectedPartitionId=[10012], selectedIndexMetaId=10014, outputColumns=[12: v4], prunedPartitionPredicates=[], limit=-1}

After the pushdown:

LOGICAL
->  LogicalProjectOperator {projection=[16: sum->16: sum]}
    ->  LogicalAggregation {type=GLOBAL ,aggregations={16: sum=sum(17: sum)} ,groupKeys=[4: t1d], partitionBys=[4: t1d]}
        ->  LogicalProjectOperator {projection=[17: sum->if(cast(1: t1a as boolean), 18: sum, null), 18: sum->18: sum, 4: t1d->4: t1d, 15: case->if(cast(1: t1a as boolean), 18: sum, null)]}
            ->  LOGICAL_JOIN {LEFT OUTER JOIN, onPredicate = 4: t1d = 12: v4 , Predicate = null}
                ->  LogicalProjectOperator {projection=[1: t1a->1: t1a, 18: sum->if(2: t1b = 1, 19: sum, 20: sum), 11: case->if(2: t1b = 1, 19: sum, 20: sum), 19: sum->19: sum, 4: t1d->4: t1d, 20: sum->20: sum]}
                    ->  LogicalAggregation {type=GLOBAL ,aggregations={19: sum=sum(21: cast), 20: sum=sum(6: t1f)} ,groupKeys=[1: t1a, 2: t1b, 4: t1d], partitionBys=[1: t1a, 2: t1b, 4: t1d]}
                        ->  LogicalProjectOperator {projection=[1: t1a->1: t1a, 2: t1b->2: t1b, 4: t1d->4: t1d, 5: t1e->5: t1e, 21: cast->cast(5: t1e as double), 6: t1f->6: t1f]}
                            ->  LogicalOlapScanOperator {table=test_all_type, selectedPartitionId=[10136], selectedIndexMetaId=10138, outputColumns=[1: t1a, 2: t1b, 4: t1d, 5: t1e, 6: t1f], prunedPartitionPredicates=[], limit=-1}
                ->  LogicalProjectOperator {projection=[12: v4->12: v4]}
                    ->  LogicalOlapScanOperator {table=t1, selectedPartitionId=[10012], selectedIndexMetaId=10014, outputColumns=[12: v4], prunedPartitionPredicates=[], limit=-1}
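
As a sanity check on the rewritten plan shape, the following Python sketch compares the direct evaluation with the pushed-down one on made-up rows. It is a simplified model: the sample data is invented, the join is omitted, integers stand in for the double columns, and `sql_sum`, `direct`, and `pushed_down` are hypothetical helpers.

```python
from collections import defaultdict


def sql_sum(values):
    """SUM with SQL semantics: NULLs are skipped, all-NULL input yields NULL."""
    present = [v for v in values if v is not None]
    return sum(present) if present else None


# Invented sample rows: (t1a, t1b, t1d, t1e, t1f)
rows = [
    (True, 1, 10, 5, 7),
    (True, 2, 10, 3, 4),
    (False, 1, 10, 9, 9),
    (True, 1, 20, 1, 2),
    (True, 1, 20, 6, 8),
    (False, 2, 30, 2, 2),
]


def direct(rows):
    """sum(if(t1a, if(t1b = 1, t1e, t1f), null)) grouped by t1d."""
    groups = defaultdict(list)
    for a, b, d, e, f in rows:
        inner = e if b == 1 else f
        groups[d].append(inner if a else None)
    return {d: sql_sum(vs) for d, vs in groups.items()}


def pushed_down(rows):
    """Pre-aggregate by (t1a, t1b, t1d), apply the IFs to the partial sums,
    then aggregate again by t1d."""
    pre = defaultdict(lambda: ([], []))
    for a, b, d, e, f in rows:
        es, fs = pre[(a, b, d)]
        es.append(e)
        fs.append(f)
    groups = defaultdict(list)
    for (a, b, d), (es, fs) in pre.items():
        inner = sql_sum(es) if b == 1 else sql_sum(fs)  # like 18: sum
        groups[d].append(inner if a else None)          # like 17: sum
    return {d: sql_sum(vs) for d, vs in groups.items()}


print(direct(rows))       # {10: 9, 20: 7, 30: None}
print(pushed_down(rows))  # {10: 9, 20: 7, 30: None}
```

Both conditions can be evaluated after the pre-aggregation only because `t1a` and `t1b` are part of the lower group keys. If `t1b` were missing from those keys, the inner `if(t1b = 1, ...)` would have no single `t1b` value per group to test.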

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5
    • 3.4

@lxynov lxynov requested a review from a team as a code owner April 5, 2026 21:38
@mergify mergify bot assigned lxynov Apr 5, 2026
@mergify mergify bot assigned lxynov Apr 6, 2026
@CelerData-Reviewer

@codex review

When pushing down an aggregate whose argument is a nested case-when/if,
the previous code failed to add the inner condition columns to group-bys.

stdpain commented Apr 6, 2026

mysql> SELECT SUM(sub.cval), MIN(sub.cval), sub.fk
    -> FROM (
    ->     SELECT t1d AS fk,
    ->            IF(t1b = 1, t1e, NULL) AS cval
    ->     FROM test_all_type
    -> ) sub
    -> JOIN t0 ON sub.fk = t0.c1
    -> GROUP BY sub.fk;
ERROR 1064 (HY000): only found column statistics: {2: t1b, 4: t1d, 5: t1e}, but missing statistic of col: 35: sum.

@CelerData-Reviewer

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: da35ed69fa

@stdpain stdpain self-assigned this Apr 6, 2026

github-actions bot commented Apr 6, 2026

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)


github-actions bot commented Apr 6, 2026

[FE Incremental Coverage Report]

pass : 32 / 32 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 com/starrocks/sql/optimizer/rule/tree/pdagg/PushDownAggregateCollector.java | 30 | 30 | 100.00% | [] |
| 🔵 com/starrocks/sql/optimizer/rule/tree/pdagg/PushDownAggregateRewriter.java | 2 | 2 | 100.00% | [] |


github-actions bot commented Apr 6, 2026

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)


lxynov commented Apr 6, 2026

Thanks @stdpain for adding another fix.

It fixes a separate bug, so I'd recommend either landing it as its own commit when merging or updating the squashed commit's message.


lxynov commented Apr 6, 2026

After this PR is merged, could you help backport it to branch-3.5-cc? Thank you! (we don't have the permission to call @Mergifyio)

@stdpain stdpain changed the title [BugFix] Fix aggregate pushdown with nested case-when/ifs [BugFix] Fix shared object mutation in PushDownAggregateRewriter for case-when/if Apr 7, 2026

stdpain commented Apr 7, 2026

@Mergifyio backport branch-3.5-cc


mergify bot commented Apr 7, 2026

backport branch-3.5-cc

🟠 Waiting for conditions to match

Details
  • merged [📌 backport requirement]
