feat(multi_stage): add median measure type with PERCENTILE_CONT support #10736

Draft
TheRoot-1 wants to merge 1 commit into cube-js:master from TheRoot-1:feat/multi-stage-median

@TheRoot-1

Summary

Adds median as a supported multi_stage measure type, rendered as an ordered-set aggregate (PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY ...)). This is the first half of #10560. dense_rank is not yet implemented and will come as a follow-up (or a separate PR) depending on review feedback.

Without this, users hit a schema validation error (median is not a valid multiStageMeasureType) and have no way to compute medians that respects runtime filters. They are forced into static subqueries that ignore filters.

What's included

Schema compiler (JS planner):

  • New median type in multiStageMeasureType validator
  • SQL rendering path in BaseQuery.js that emits PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY <inner_measure>) against the correct CTE reference
  • Unit tests for validator + SQL generation
  • Postgres integration test
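
To make the two JS-side changes concrete, here is a minimal sketch of an allowlist check followed by the median rendering. It is not the code in this PR: `ALLOWED_TYPES`, `validateMultiStageType`, and `renderMultiStageMeasure` are hypothetical names, the allowlist contents are an assumed subset, and the real logic lives in CubeValidator and BaseQuery.js.

```javascript
// Illustrative sketch only. All names here are hypothetical and the allowlist
// is an assumed subset; the actual validator and renderer in the PR differ.
const ALLOWED_TYPES = new Set(['sum', 'avg', 'min', 'max', 'median']);

// Mirrors the validation error users hit before `median` was allowed.
function validateMultiStageType(type) {
  if (!ALLOWED_TYPES.has(type)) {
    throw new Error(`${type} is not a valid multiStageMeasureType`);
  }
  return type;
}

// Renders the outer aggregate over a column exposed by the inner CTE.
function renderMultiStageMeasure(type, innerColumnSql) {
  validateMultiStageType(type);
  if (type === 'median') {
    // Ordered-set aggregate: the column goes in ORDER BY, not in the call args.
    return `PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY ${innerColumnSql})`;
  }
  // Simplified fallback for plain aggregates such as sum or avg.
  return `${type}(${innerColumnSql})`;
}

const medianSql = renderMultiStageMeasure('median', '"cube___inner_order_count"');
console.log(medianSql);
```

The point of the sketch is that median cannot reuse the generic `fn(column)` shape, because an ordered-set aggregate takes its input through `WITHIN GROUP (ORDER BY ...)`.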

Tesseract (Rust planner):

  • New MultiStageMedianNode SQL node
  • Registration in SqlNodesFactory
  • Measure-kind plumbing in measure_kinds/mod.rs
  • Logical + physical plan support (calculation.rs, multi_stage_measure_calculation.rs)
  • Member/query planner integration
  • 5 integration tests covering basic, group_by, dimension, filter, no-granularity paths
  • Unit test on measure symbol properties

Example

measures:
  - name: _inner_order_count
    sql: order_id
    type: count_distinct
    public: false

  - name: median_orders_per_customer
    sql: "{_inner_order_count}"
    type: median
    multi_stage: true
    group_by: []

Generated SQL (Postgres):

WITH cte_0 AS (
  SELECT count(distinct order_id) "cube___inner_order_count" FROM ...
)
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY "cube___inner_order_count") FROM cte_0
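
For reviewers checking expected values in the tests, the following sketch reproduces what PERCENTILE_CONT(0.5) returns: the values are sorted, NULLs are ignored, and the result is linearly interpolated between the two middle rows when the row count is even. The helper `percentileCont` and the sample data are made up for illustration and are not part of the PR.

```javascript
// Reference sketch of PERCENTILE_CONT semantics (hypothetical helper, not PR code).
// Position is fraction * (n - 1) over the sorted, non-null values, with linear
// interpolation between the neighbouring rows.
function percentileCont(values, fraction) {
  const sorted = values
    .filter((v) => v !== null && v !== undefined) // the aggregate ignores NULLs
    .sort((a, b) => a - b);
  if (sorted.length === 0) return null; // an empty input yields NULL
  const pos = fraction * (sorted.length - 1);
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

// Made-up per-customer order counts.
const evenCount = percentileCont([10, 1, 4, 2], 0.5); // 3: midway between 2 and 4
const oddCount = percentileCont([4, 1, 2], 0.5); // 2: the middle row
console.log(evenCount, oddCount);
```

Because the result is interpolated, the median of integer counts can be fractional, which matters when asserting on integration-test output.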

Test plan

  • cubejs-schema-compiler unit tests pass (test/unit/base-query.test.ts, test/unit/cube-validator.test.ts)
  • cubesqlplanner full suite: 876 passed, 0 failed
  • Postgres integration tests: written at test/integration/postgres/multi-stage.test.ts but not yet run locally (requires a running DB)
  • Maintainer sanity check on the SQL shape before widening scope to dense_rank

Notes

Closes part of #10560

Adds `median` as a valid multi_stage measure type, rendering as
`PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY <sql>)` in both the
JS schema-compiler path and the Tesseract native SQL planner.

JS changes:
- CubeValidator: add 'median' to multiStageMeasureType allowlist
- BaseQuery: render PERCENTILE_CONT in multi_stage and general aggregation paths

Tesseract changes:
- MeasureKind::Median variant with full dependency/SQL wiring
- MultiStageMedianNode SQL node for PERCENTILE_CONT rendering
- MultiStageInodeMemberType::Median planner variant
- MultiStageCalculationWindowFunction::Median physical plan variant

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions (bot) added the rust, javascript, and pr:community labels on Apr 23, 2026