
Use log-scaled quantile sketch budgets and rank-based accuracy checks #12129

Open
RAMitchell wants to merge 10 commits into dmlc:master from RAMitchell:cpu-distributed-quantile-logn-budget


Member

@RAMitchell RAMitchell commented Mar 25, 2026

Summary

This PR aligns quantile sketch sizing more closely with the single-machine algorithm and updates the test suite to validate rank-error guarantees instead of cut-value deltas.

The main functional change is on the CPU distributed sketch path: we now track the number of represented elements per feature, serialize those counts through the distributed sketch payload, and recompute SketchSummaryBudget(...) after merge/prune using the summed per-feature counts. This changes the distributed CPU merge budget from a fixed O(1 / eps) cap to the same O(log n / eps) budget shape used by the underlying sketch.

In addition, this PR cleans up related sizing paths and strengthens quantile accuracy coverage across C++ and Python.

What This Changes

  • track represented element counts in WQuantileSketch
  • serialize per-feature element counts in the CPU distributed sketch payload
  • recompute the CPU distributed merge/prune budget from summed per-feature counts using SketchSummaryBudget(...)
  • use the same summary-budget helper in the CPU sorted-column ingestion path
  • preserve exact weighted values in the sorted sketch when the budget can retain every unique value
  • deduplicate sketch-budget logic by routing related CPU/GPU helper paths through the shared budget helpers
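
The budget change can be pictured with a minimal Python sketch. It is an illustration only: `fixed_budget`, `log_scaled_budget`, and `merged_budgets` are hypothetical names, and the formula is an assumed stand-in with the O(log n / eps) shape, not the actual body of SketchSummaryBudget(...).

```python
import math
from typing import List, Sequence, Tuple


def fixed_budget(eps: float) -> int:
    """Old-style cap: O(1 / eps), independent of how much data was seen."""
    return math.ceil(1.0 / eps)


def log_scaled_budget(n: int, eps: float) -> int:
    """Assumed O(log n / eps) shape; never larger than n itself."""
    if n <= 0:
        return 0
    levels = max(1, math.ceil(math.log2(n)))
    return min(n, math.ceil(levels / eps))


def merged_budgets(
    per_worker_counts: Sequence[Sequence[int]], eps: float
) -> Tuple[List[int], List[int]]:
    """Sum per-feature element counts over workers, then size each feature.

    per_worker_counts[w][f] is the number of elements worker w
    contributed to feature f (hypothetical payload layout).
    """
    n_features = len(per_worker_counts[0])
    totals = [sum(worker[f] for worker in per_worker_counts) for f in range(n_features)]
    return totals, [log_scaled_budget(n, eps) for n in totals]


# Two workers, two features; the second feature is sparse.
totals, budgets = merged_budgets([[100, 0], [900, 8]], eps=0.25)
```

With these assumptions a dense feature gets a budget that grows with the logarithm of its summed count, while a feature with only a handful of elements is capped at that count, so its summary can stay exact.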

Test Changes

  • replace CPU distributed cut-to-cut comparisons with rank-error validation
  • add sparse row-split distributed tests where per-feature counts vary across both features and workers
  • add deterministic sorted weighted exact-cut coverage
  • align local GPU quantile tests with the same rank-based validation contract and shared weighted tolerance
  • add shared Python rank-error validation helpers and use them in QuantileDMatrix / quantile-cut tests
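
As a rough picture of what rank-based validation means, the sketch below scores cuts by how far their rank in the sorted data falls from the ideal quantile rank, rather than by comparing cut values. The helper name `max_normalized_rank_error` and the normalization (error divided by the ideal bin width) are assumptions for illustration; the helpers added in this PR may define the contract differently.

```python
from bisect import bisect_left
from typing import Sequence


def max_normalized_rank_error(values: Sequence[float], cuts: Sequence[float]) -> float:
    """Worst rank error over all cuts, in units of the ideal bin width.

    Each cut is treated as the upper bound of its bin, so the i-th cut
    (1-based) ideally has i * n / len(cuts) values strictly below it.
    """
    data = sorted(values)
    bin_width = len(data) / len(cuts)
    worst = 0.0
    for i, cut in enumerate(cuts, start=1):
        rank = bisect_left(data, cut)  # number of values strictly below the cut
        target = i * bin_width
        worst = max(worst, abs(rank - target) / bin_width)
    return worst


data = list(range(100))
good = max_normalized_rank_error(data, [25, 50, 75, 100])
bad = max_normalized_rank_error(data, [1, 2, 3, 100])
```

A check of this kind accepts any set of cuts whose ranks are close enough, which is why two sketches with different cut values can both pass.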

Testing

Ran locally:

  • ./build-cpu/testxgboost --gtest_filter='Quantile.*:HistUtil.*'
  • ./build-cuda-local/testxgboost --gtest_filter='HistUtil.*:GPUQuantile.*'
  • pytest tests/python/test_data_iterator.py tests/python/test_quantile_dmatrix.py tests/python/test_updaters.py -k "test_data_iterator or test_training or test_ref_quantile_cut or test_get_quantile_cut"

Notes

This PR is no longer limited to the CPU distributed merge/prune path. It now includes:

  • the CPU distributed log n / eps budget plumbing
  • the sorted weighted exact-summary fix
  • shared rank-based validation updates across C++, GPU coverage, and Python
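
The idea behind the sorted weighted exact-summary item can be sketched as follows, assuming sorted input and a known summary budget. `exact_weighted_summary` is a hypothetical helper, not the code in src/common/quantile.h.

```python
from typing import List, Optional, Sequence, Tuple


def exact_weighted_summary(
    sorted_values: Sequence[float], weights: Sequence[float], budget: int
) -> Optional[List[Tuple[float, float]]]:
    """Collapse duplicates in sorted input, summing their weights.

    Returns the exact (value, total_weight) list when it fits in the
    budget, or None to signal that the caller must fall back to the
    approximate (pruned) sketch.
    """
    entries: List[List[float]] = []
    for value, weight in zip(sorted_values, weights):
        if entries and entries[-1][0] == value:
            entries[-1][1] += weight  # same value: accumulate weight
        else:
            entries.append([value, weight])
    if len(entries) > budget:
        return None
    return [(value, weight) for value, weight in entries]


exact = exact_weighted_summary([1.0, 1.0, 2.0, 5.0], [0.5, 1.5, 1.0, 2.0], budget=3)
too_small = exact_weighted_summary([1.0, 1.0, 2.0, 5.0], [0.5, 1.5, 1.0, 2.0], budget=2)
```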

@RAMitchell RAMitchell changed the title from "[WIP] Increase CPU distributed quantile sketch budget to O(log n / eps)" to "Use log-scaled quantile sketch budgets and rank-based accuracy checks" Apr 1, 2026
@RAMitchell RAMitchell marked this pull request as ready for review April 1, 2026 11:39
@RAMitchell RAMitchell requested a review from Copilot April 1, 2026 11:39
Contributor

Copilot AI left a comment


Pull request overview

This PR updates quantile sketch budgeting to follow the same O(log n / eps) summary-size behavior as the single-machine sketch (including distributed CPU merge/prune), and refreshes test coverage to validate the rank-error contract instead of comparing cut values directly.

Changes:

  • Track per-feature represented element counts in WQuantileSketch, serialize them in the distributed CPU sketch allreduce payload, and recompute merge/prune budgets from those counts.
  • Route multiple CPU/GPU sketch sizing paths through shared budget helpers (SketchSummaryBudget), including the GPU intermediate prune target.
  • Replace/extend C++ and Python tests to use rank-based cut validation (plus exact-cut coverage when the budget can retain all unique values).

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/common/quantile.h | Adds element-count tracking to WQuantileSketch and an exact-summary fast path for sorted weighted input. |
| src/common/quantile.cc | Extends distributed sketch payload to include element counts and uses SketchSummaryBudget during merge/prune and sorted ingestion. |
| src/common/quantile.cu | Uses SketchSummaryBudget for GPU intermediate pruning instead of a local helper. |
| src/common/quantile.cuh | Removes IntermediateNumCuts() helper (now replaced by shared budget helper usage). |
| src/common/hist_util.cu | Switches sample-cut sizing to SketchSummaryBudget. |
| tests/cpp/common/test_hist_util.h | Tightens/aligns rank-error thresholds, updates exact-value validation, and adds a weight-aware validation wrapper. |
| tests/cpp/common/test_hist_util.cu | Uses the new weight-aware validation wrapper for GPU sketch tests. |
| tests/cpp/common/test_hist_util.cc | Adjusts rank-error validation for weighted CPU cases and adds a sorted weighted exact-cut regression test. |
| tests/cpp/common/test_quantile.cc | Reworks distributed CPU quantile tests to validate rank error (row/column split + sparse count skew). |
| tests/cpp/common/test_quantile.cu | Aligns distributed GPU weighted tolerance usage with the shared weighted threshold. |
| python-package/xgboost/testing/quantile_dmatrix.py | Adds shared Python rank-error validation helpers and uses them in reference-cut checks. |
| python-package/xgboost/testing/updater.py | Adds rank-error assertions for get_quantile_cut device tests (numerical case). |
| tests/python/test_data_iterator.py | Replaces local rank-error helper with shared Python helper. |
| tests/python/test_quantile_dmatrix.py | Adds rank-error assertions for iterator-vs-array quantile cuts in training test. |


@RAMitchell RAMitchell requested a review from trivialfis April 1, 2026 14:11
Comment on lines +14 to +15
MAX_NORMALIZED_RANK_ERROR = 2.0
MAX_WEIGHTED_NORMALIZED_RANK_ERROR = 14.0
Member


Could you please provide some brief comments on utilities here?


3 participants