[Chore](pick) pick changes from PR #61104 and PR #60941 #61303
BiteTheDDDDt wants to merge 5 commits into apache:branch-4.1 from
run buildall
Pull request overview
This PR cherry-picks changes from #61104 and #60941 to propagate a “single backend query” hint from FE→BE and to optimize string hash-table aggregation hot paths via batched sub-table operations.
Changes:
- Add single_backend_query to TQueryOptions and propagate it through FE planning and BE query context.
- Adjust streaming aggregation hash-table expansion heuristics when running on a single backend.
- Add batch emplace/find helpers for string hash tables by grouping rows per sub-table to reduce per-row dispatch overhead.
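The row-grouping idea in the last bullet can be sketched as below. This is a minimal illustration, not the code from the PR: `bucket_of`, `group_rows_by_bucket`, `count_batched`, and the length-based bucketing rule are all hypothetical names and assumptions chosen for the example, and `std::unordered_map` stands in for the real sub-tables.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical number of sub-tables for this sketch.
constexpr std::size_t kNumBuckets = 5;

// Assumed bucketing rule: 0 = empty key, 1 = 1..8 bytes, 2 = 9..16 bytes,
// 3 = 17..24 bytes, 4 = anything longer.
inline std::size_t bucket_of(std::string_view key) {
    if (key.empty()) return 0;
    if (key.size() <= 8) return 1;
    if (key.size() <= 16) return 2;
    if (key.size() <= 24) return 3;
    return 4;
}

// Pass 1: for each bucket, collect the indices of the rows that belong to it.
inline std::array<std::vector<std::uint32_t>, kNumBuckets> group_rows_by_bucket(
        const std::vector<std::string>& keys) {
    std::array<std::vector<std::uint32_t>, kNumBuckets> groups;
    for (std::uint32_t row = 0; row < keys.size(); ++row) {
        groups[bucket_of(keys[row])].push_back(row);
    }
    return groups;
}

using SubTable = std::unordered_map<std::string, std::size_t>;

// Pass 2: visit each bucket once and insert all of its rows in a tight loop,
// so the sub-table is chosen once per bucket rather than once per row.
inline void count_batched(const std::vector<std::string>& keys,
                          std::array<SubTable, kNumBuckets>& tables) {
    const auto groups = group_rows_by_bucket(keys);
    for (std::size_t b = 0; b < kNumBuckets; ++b) {
        SubTable& table = tables[b];
        for (std::uint32_t row : groups[b]) {
            assert(bucket_of(keys[row]) == b);  // grouping invariant
            ++table[keys[row]];
        }
    }
}
```

The trade-off in this sketch is one extra pass over the batch and some index storage, in exchange for inner loops that each touch a single sub-table.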
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| gensrc/thrift/PaloInternalService.thrift | Adds single_backend_query to query options for FE→BE propagation. |
| fe/fe-core/src/main/java/org/apache/doris/qe/runtime/ThriftPlansBuilder.java | Sets single_backend_query into TQueryOptions based on coordinator context. |
| fe/fe-core/src/main/java/org/apache/doris/qe/CoordinatorContext.java | Computes whether the query uses a single backend. |
| fe/fe-core/src/main/java/org/apache/doris/qe/Coordinator.java | Propagates single_backend_query into per-fragment pipeline params. |
| be/src/vec/common/hash_table/string_hash_table.h | Exposes submaps + adds visit_submaps() for batch operations. |
| be/src/vec/common/hash_table/hash_map_context.h | Adds row grouping + batch emplace/find helpers for string hash maps. |
| be/src/runtime/query_context.h | Stores is_single_backend_query in BE query context. |
| be/src/runtime/fragment_mgr.cpp | Initializes QueryContext’s is_single_backend_query before prepare(). |
| be/src/pipeline/exec/streaming_aggregation_operator.h | Adds _is_single_backend flag. |
| be/src/pipeline/exec/streaming_aggregation_operator.cpp | Uses different reduction thresholds for single-backend queries + batch emplace. |
| be/src/pipeline/exec/distinct_streaming_aggregation_operator.h | Adds _is_single_backend flag; fixes include formatting. |
| be/src/pipeline/exec/distinct_streaming_aggregation_operator.cpp | Applies single-backend thresholds + batch emplace (void). |
| be/src/pipeline/exec/aggregation_source_operator.cpp | Switches to lazy_emplace_batch() for emplace hot path. |
| be/src/pipeline/exec/aggregation_sink_operator.cpp | Switches to lazy_emplace_batch() / find_batch() for hot paths. |
| be/src/clucene | Updates the clucene subproject commit pointer. |
    void prefetch(size_t i) {
        if (LIKELY(i + HASH_MAP_PREFETCH_DIST < hash_values.size())) {
            hash_table->template prefetch<read>(keys[i + HASH_MAP_PREFETCH_DIST],
                                                hash_values[i + HASH_MAP_PREFETCH_DIST]);
        }
    }

    template <typename State>
    ALWAYS_INLINE auto find(State& state, size_t i) {
        if constexpr (!is_string_hash_map()) {
            prefetch<true>(i);
        }
    auto find(State& state, size_t i) {
        prefetch<true>(i);
        return state.find_key_with_hash(*hash_table, i, keys[i], hash_values[i]);
    }

    template <typename State, typename F, typename FF>
    ALWAYS_INLINE auto lazy_emplace(State& state, size_t i, F&& creator,
                                    FF&& creator_for_null_key) {
        if constexpr (!is_string_hash_map()) {
            prefetch<false>(i);
        }
    auto lazy_emplace(State& state, size_t i, F&& creator, FF&& creator_for_null_key) {
        prefetch<false>(i);
        return state.lazy_emplace_key(*hash_table, i, keys[i], hash_values[i], creator,
                                      creator_for_null_key);
    }
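The bounds-checked look-ahead in the `prefetch` helper above follows a common pattern: while processing row `i`, request the cache line that row `i + distance` will need. A minimal standalone sketch of that pattern, assuming a GCC/Clang toolchain for `__builtin_prefetch`, is shown here. `kPrefetchDist`, `prefetch_slot`, and `sum_with_prefetch` are hypothetical names, and a plain vector of slots stands in for the hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical look-ahead distance; it plays the role that
// HASH_MAP_PREFETCH_DIST plays in the snippet above.
constexpr std::size_t kPrefetchDist = 16;

// Issue a read prefetch where the compiler supports it; otherwise do nothing.
inline void prefetch_slot(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p, 0 /* read */, 3 /* high temporal locality */);
#else
    (void)p;
#endif
}

// Sums slots[hashes[i] % slots.size()] over all rows. Before touching the slot
// for row i, it prefetches the slot that row i + kPrefetchDist will read.
inline std::uint64_t sum_with_prefetch(const std::vector<std::uint64_t>& slots,
                                       const std::vector<std::size_t>& hashes) {
    if (slots.empty()) return 0;
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < hashes.size(); ++i) {
        // Same guard as in the snippet: never look past the end of the batch.
        if (i + kPrefetchDist < hashes.size()) {
            prefetch_slot(&slots[hashes[i + kPrefetchDist] % slots.size()]);
        }
        total += slots[hashes[i] % slots.size()];
    }
    return total;
}
```

The prefetch is only a hint, so the function returns the same result with or without it; any benefit depends on the table being larger than the cache.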
    if constexpr (is_nullable) {
        if (state.key_column->is_null_at(row)) {
            bool has_null_key = hash_table.has_null_key_data();
            hash_table.has_null_key_data() = true;
            if (!has_null_key) {
                std::forward<FF>(creator_for_null_key)();
            }
            continue;
        }
    }
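The null-key branch above creates the null group only the first time a null key is seen, then skips the row. A reduced sketch of that control flow is given below under stated assumptions: `NullAwareTable` and `emplace_batch` are hypothetical, `std::optional` models a nullable key column, and the callback is invoked as an lvalue (without `std::forward`) because it sits inside a loop.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a hash table with a separate null-key slot.
struct NullAwareTable {
    bool has_null_key = false;
    std::unordered_map<std::string, std::size_t> groups;
};

// For each row: a null key triggers creator_for_null_key only on first sight;
// a non-null key triggers creator(row) only when the key is newly inserted.
template <typename Creator, typename NullCreator>
void emplace_batch(NullAwareTable& table,
                   const std::vector<std::optional<std::string>>& keys,
                   Creator&& creator, NullCreator&& creator_for_null_key) {
    for (std::size_t row = 0; row < keys.size(); ++row) {
        if (!keys[row].has_value()) {
            const bool had_null_key = table.has_null_key;
            table.has_null_key = true;
            if (!had_null_key) {
                creator_for_null_key();  // called as an lvalue inside the loop
            }
            continue;
        }
        auto [it, inserted] = table.groups.try_emplace(*keys[row], 0);
        if (inserted) {
            it->second = creator(row);
        }
    }
}
```

Because `has_null_key` lives in the table rather than in the loop, a second batch containing nulls does not create the null group again.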
    // Expand into L3 cache if we look like we're getting some reduction.
    // At present, The L2 cache is generally 1024k or more
    {1024 * 1024, 1.1},
    {.min_ht_mem = 256 * 1024, .streaming_ht_min_reduction = 1.1},

    // Expand into L3 cache if we look like we're getting some reduction.
    // At present, The L2 cache is generally 1024k or more
    {.min_ht_mem = 256 * 1024, .streaming_ht_min_reduction = 5.0},

    // Expand into L3 cache if we look like we're getting some reduction.
    // At present, The L2 cache is generally 1024k or more
    {.min_ht_mem = 256 * 1024, .streaming_ht_min_reduction = 5.0},
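The entries above pair a hash-table size with a minimum reduction ratio. One plausible way such a table drives the "keep expanding or not" decision is sketched below. Everything in it is an assumption made for illustration: `ReductionLevel`, `required_reduction`, and `should_expand` are hypothetical names, the `{0, 0.0}` first level is invented, and the excerpts do not show which table the PR assigns to single-backend queries, so the assignment here is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ReductionLevel {
    std::size_t min_ht_mem;  // table size in bytes at which this level starts to apply
    double min_reduction;    // required ratio of input rows to output groups
};

// Returns the requirement of the last level whose size bound has been reached.
// Assumes `levels` is sorted by ascending min_ht_mem.
inline double required_reduction(const std::vector<ReductionLevel>& levels,
                                 std::size_t ht_mem) {
    double required = 0.0;
    for (const ReductionLevel& level : levels) {
        if (ht_mem >= level.min_ht_mem) {
            required = level.min_reduction;
        }
    }
    return required;
}

// Picks one of two illustrative tables based on the single-backend flag and
// compares the observed reduction against that table's requirement.
inline bool should_expand(bool is_single_backend, std::size_t ht_mem,
                          double observed_reduction) {
    static const std::vector<ReductionLevel> table_a = {{0, 0.0}, {256 * 1024, 1.1}};
    static const std::vector<ReductionLevel> table_b = {{0, 0.0}, {256 * 1024, 5.0}};
    const std::vector<ReductionLevel>& levels = is_single_backend ? table_b : table_a;
    return observed_reduction > required_reduction(levels, ht_mem);
}
```

With these illustrative values, a 512 KiB table with an observed reduction of 2.0 keeps expanding under the first table but not under the second.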
pick changes from PR #61104 and PR #60941