[feat](spill) Support spill repartition#61088

Open
mrhhsg wants to merge 1 commit into apache:master from mrhhsg:spill_repatition

@mrhhsg
Member

mrhhsg commented Mar 5, 2026

[c7b56dbe6a5][2026-03-05][Hu Shenggang] fix ut
[b3a486d14b9][2026-03-05][Hu Shenggang] tiny mod
[6411b66][2026-03-05][yiguolei ] refactor code and add unit test
[23854d8][2026-03-05][yiguolei ] add force spill logic in join sink operator
[a631dac][2026-03-04][yiguolei ] update dir and file meta realtime
[a3fd36e][2026-03-04][yiguolei ] fix agg profile bug
[c0380ad][2026-03-04][yiguolei ] change spill file to shared ptr
[05670c5][2026-03-04][yiguolei ] fix probe revokeable memory size bug
[5595d90][2026-03-03][yiguolei ] fix compile bug
[36e3004][2026-03-03][yiguolei ] simplify code
[9094b6b][2026-03-03][yiguolei ] simplify agg code
[fdb355f][2026-03-03][yiguolei ] simplify probe code enhancement probe operator
[82e94bb][2026-03-03][yiguolei ] refactor spill file interface
[7548fe8][2026-03-02][Hu Shenggang] some tiny fix
[2329442][2026-03-01][Hu Shenggang] fix agg revocable mem size
[2c23a07][2026-02-28][Hu Shenggang] disbale distinct streaming agg when spill enabled
[5c27350][2026-02-28][Hu Shenggang] Using spill_buffer_size_bytes as read limit when recovering data
[35f2c55][2026-02-28][Hu Shenggang] [pipeline] Proactively pause query for spill under memory pressure in PipelineTask
[4df2277][2026-02-28][Hu Shenggang] Make spill stream RAII
[6a99170][2026-02-28][Hu Shenggang] Clear revoked data in agg & avoid null pointer in join
[995aeec][2026-02-27][Hu Shenggang] Make repartitioner level-aware
[87cbf87][2026-02-27][yiguolei ] fix compile
[8a250cd][2026-02-27][yiguolei ] fix compile
[a92b150][2026-02-27][yiguolei ] remove some codes
[37df1fd][2026-02-27][yiguolei ] f
[dd6d1f0][2026-02-27][yiguolei ] f
[eb8a827][2026-02-27][yiguolei ] f
[f73c0e9][2026-02-27][yiguolei ] f
[2c989cb][2026-02-27][yiguolei ] repartitioner

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Mar 5, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg
Member Author

mrhhsg commented Mar 5, 2026

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.35% (1798/2266)
Line Coverage 64.54% (32213/49910)
Region Coverage 65.44% (16132/24653)
Branch Coverage 55.89% (8585/15360)
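
Each percentage in the table above is the covered count divided by the total count shown in parentheses. A minimal sketch for re-deriving them, assuming two-decimal rounding (the helper name `coverage_percent` is illustrative and not part of the Doris tooling):

```python
def coverage_percent(covered: int, total: int) -> float:
    """Return covered/total as a percentage rounded to two decimals."""
    return round(100 * covered / total, 2)


# (covered, total) pairs copied from the Cloud UT coverage table.
REPORT = {
    "Function": (1798, 2266),
    "Line": (32213, 49910),
    "Region": (16132, 24653),
    "Branch": (8585, 15360),
}

for name, (covered, total) in REPORT.items():
    pct = coverage_percent(covered, total)
    print(f"{name} Coverage {pct:.2f}% ({covered}/{total})")
```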

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (12/12) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 27821 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a97f7c8b2b4b4addbad0972e35db117d739d5ba9, data reload: false

------ Round 1 ----------------------------------
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17660	4545	4339	4339
q2	
q3	10723	801	525	525
q4	4723	369	255	255
q5	8138	1207	1041	1041
q6	220	172	145	145
q7	813	870	672	672
q8	10701	1509	1326	1326
q9	6800	4768	4748	4748
q10	6668	1929	1682	1682
q11	444	270	256	256
q12	779	568	464	464
q13	18133	2931	2209	2209
q14	238	231	216	216
q15	926	796	826	796
q16	778	730	679	679
q17	718	896	400	400
q18	6108	5463	5288	5288
q19	1160	981	590	590
q20	499	488	384	384
q21	4839	1953	1543	1543
q22	394	312	263	263
Total cold run time: 101462 ms
Total hot run time: 27821 ms
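
The two totals can be re-derived from the rows above. The numbers are consistent with reading the columns as one cold run, two hot runs, and the faster of the two hot runs: the cold total is the sum of the first column and the hot total is the sum of the per-query minimum. The sketch below checks this for Round 1. The column interpretation is inferred from the arithmetic rather than stated by the bot, and the single set of timings on the q2/q3 line is attributed to q3 as an assumption.

```python
# Round 1 timings in ms, copied from the table: (cold, hot 1, hot 2).
# Assumption: q2 reported no timings; the values on the "q2 q3" line belong to q3.
ROUND1 = {
    "q1": (17660, 4545, 4339), "q3": (10723, 801, 525),
    "q4": (4723, 369, 255), "q5": (8138, 1207, 1041),
    "q6": (220, 172, 145), "q7": (813, 870, 672),
    "q8": (10701, 1509, 1326), "q9": (6800, 4768, 4748),
    "q10": (6668, 1929, 1682), "q11": (444, 270, 256),
    "q12": (779, 568, 464), "q13": (18133, 2931, 2209),
    "q14": (238, 231, 216), "q15": (926, 796, 826),
    "q16": (778, 730, 679), "q17": (718, 896, 400),
    "q18": (6108, 5463, 5288), "q19": (1160, 981, 590),
    "q20": (499, 488, 384), "q21": (4839, 1953, 1543),
    "q22": (394, 312, 263),
}


def totals(runs):
    """Return (total cold time, total best-hot time) in ms."""
    cold = sum(t[0] for t in runs.values())
    hot = sum(min(t[1:]) for t in runs.values())
    return cold, hot


cold_total, hot_total = totals(ROUND1)
print(f"Total cold run time: {cold_total} ms")  # 101462 ms
print(f"Total hot run time: {hot_total} ms")    # 27821 ms
```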

----- Round 2, with runtime_filter_mode=off -----
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4743	4572	4598	4572
q2	
q3	3929	4334	3795	3795
q4	859	1188	777	777
q5	4112	4413	4374	4374
q6	178	173	148	148
q7	1795	1677	1578	1578
q8	2627	2738	2594	2594
q9	7505	7468	7399	7399
q10	3728	4020	3631	3631
q11	524	439	420	420
q12	488	605	470	470
q13	2850	3377	2354	2354
q14	301	312	278	278
q15	905	820	808	808
q16	715	752	719	719
q17	1186	1437	1387	1387
q18	7245	6915	6736	6736
q19	886	869	902	869
q20	2102	2221	2019	2019
q21	4018	3474	3347	3347
q22	445	423	380	380
Total cold run time: 51141 ms
Total hot run time: 48655 ms

@doris-robot

TPC-DS: Total hot run time: 153786 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a97f7c8b2b4b4addbad0972e35db117d739d5ba9, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4322	654	522	522
query6	326	223	214	214
query7	4225	470	278	278
query8	349	245	223	223
query9	8737	2765	2733	2733
query10	506	382	334	334
query11	7237	5790	5667	5667
query12	186	131	130	130
query13	1265	472	359	359
query14	5738	3764	3561	3561
query14_1	2830	2819	2838	2819
query15	209	197	176	176
query16	982	465	466	465
query17	906	729	632	632
query18	2454	464	358	358
query19	226	225	182	182
query20	137	132	131	131
query21	222	147	128	128
query22	4963	5113	4924	4924
query23	16657	16077	15791	15791
query23_1	16161	15988	15932	15932
query24	7759	1705	1279	1279
query24_1	1326	1291	1286	1286
query25	772	495	423	423
query26	1281	280	155	155
query27	2934	535	312	312
query28	4528	1903	1905	1903
query29	858	587	475	475
query30	315	250	214	214
query31	1369	1307	1220	1220
query32	88	69	70	69
query33	508	369	276	276
query34	933	929	576	576
query35	633	674	597	597
query36	1112	1125	989	989
query37	128	95	87	87
query38	2992	2940	2898	2898
query39	956	867	846	846
query39_1	822	834	822	822
query40	228	154	133	133
query41	63	59	59	59
query42	301	303	297	297
query43	238	243	228	228
query44	
query45	211	187	186	186
query46	875	996	619	619
query47	2159	2193	2065	2065
query48	315	326	236	236
query49	627	451	383	383
query50	684	276	211	211
query51	4210	4187	4088	4088
query52	284	298	282	282
query53	298	345	280	280
query54	305	279	265	265
query55	95	88	85	85
query56	310	324	307	307
query57	1363	1334	1320	1320
query58	292	292	272	272
query59	1361	1441	1294	1294
query60	350	337	307	307
query61	147	146	144	144
query62	627	589	540	540
query63	304	279	272	272
query64	5162	1299	1031	1031
query65	
query66	1479	481	342	342
query67	16374	16398	16249	16249
query68	
query69	384	303	312	303
query70	991	983	941	941
query71	358	307	296	296
query72	2812	2668	2459	2459
query73	538	547	322	322
query74	9999	9856	9786	9786
query75	2865	2723	2453	2453
query76	2296	1023	656	656
query77	348	380	299	299
query78	11156	11321	10689	10689
query79	2695	775	622	622
query80	1742	626	539	539
query81	560	277	249	249
query82	1024	153	121	121
query83	342	284	240	240
query84	301	114	94	94
query85	974	563	521	521
query86	424	309	300	300
query87	3158	3118	2985	2985
query88	3593	2733	2738	2733
query89	424	378	348	348
query90	2009	177	170	170
query91	165	160	132	132
query92	76	71	78	71
query93	1080	826	532	532
query94	635	310	296	296
query95	594	328	382	328
query96	635	512	232	232
query97	2469	2510	2436	2436
query98	238	221	218	218
query99	1019	1006	936	936
Total cold run time: 237013 ms
Total hot run time: 153786 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (12/12) 🎉
Increment coverage report
Complete coverage report

mrhhsg changed the title from "Support spill repartition" to "[feat](spill) Support spill repartition" on Mar 6, 2026