[jvm-packages] ExtenralMemory: Overlap the caching time by wbo4958 · Pull Request #11474 · dmlc/xgboost

wbo4958 · 2025-05-21T03:54:28Z

This PR tries to overlap the caching time, i.e. the time spent caching the cudf table to disk when external memory is enabled.

The previous design is:

1. Cache the cudf table to disk.
2. Deliver the cudf table to XGBoost to build the QuantileDMatrix incrementally.

These two steps run in sequence, which is time-consuming.

This PR moves the caching process into a separate thread, so that it overlaps with building the QuantileDMatrix.
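
For illustration, the overlap can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: `Batch`, `cacheToDisk`, `consume` and `run` are hypothetical stand-ins (a batch is a plain float array, and `Thread.sleep` simulates disk I/O and QuantileDMatrix work). The only point is that the write for a batch runs on a background thread while the same batch is being consumed.

```scala
import java.util.concurrent.Executors
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object OverlapSketch {
  // Hypothetical stand-in for a cudf table.
  type Batch = Array[Float]

  // Hypothetical stand-in for writing a batch to disk; records the batch size.
  def cacheToDisk(batch: Batch, cached: ArrayBuffer[Int]): Boolean = {
    Thread.sleep(20) // simulate slow disk I/O
    cached.synchronized { cached += batch.length }
    true
  }

  // Hypothetical stand-in for feeding a batch into QuantileDMatrix construction.
  def consume(batch: Batch): Float = {
    Thread.sleep(20) // simulate compute
    batch.sum
  }

  def run(batches: Seq[Batch]): (Float, List[Int]) = {
    // One dedicated writer thread, so writes happen in submission order.
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    val cached = ArrayBuffer.empty[Int]
    try {
      var total = 0f
      var pending: Option[Future[Boolean]] = None
      for (batch <- batches) {
        // Make sure the previous write finished, so at most one is in flight.
        pending.foreach(f => require(Await.result(f, Duration.Inf), "caching failed"))
        // Start caching this batch in the background ...
        pending = Some(Future(cacheToDisk(batch, cached)))
        // ... and consume it on the calling thread at the same time.
        total += consume(batch)
      }
      // Wait for the last write before returning.
      pending.foreach(f => require(Await.result(f, Duration.Inf), "caching failed"))
      (total, cached.toList)
    } finally pool.shutdown()
  }
}
```

In the sequential design each batch costs roughly write time plus consume time. In this sketch it costs roughly the larger of the two, since the write is only awaited right before the next one starts.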

I ran some tests locally on a dataset of 13.1G in total.

With this PR, 90%+ of the caching time is eliminated.

trivialfis · 2025-05-21T16:25:56Z

...ckages/xgboost4j-spark-gpu/src/main/scala/ml/dmlc/xgboost4j/scala/spark/ExternalMemory.scala

-      count += 1
-      Thread.sleep(50)
+    // If timeout, it's going to throw exception
+    val success = Await.result(futureOpt.get, 6.seconds)


6 seconds is not a lot when the data is large and the system is busy. Why do we need a timeout here? Can writing to disk hang?
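
To make the trade-off concrete, here is a small sketch of how `Await.result` behaves with a finite timeout versus `Duration.Inf`. It is not taken from the PR: `slowWrite`, `waitBounded` and `waitUnbounded` are hypothetical helpers, and the sleep stands in for a disk write. With a finite limit, a write that is merely slow is reported as a timeout even though it would have succeeded.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object AwaitTimeoutSketch {
  // Hypothetical stand-in for a background disk write that takes `millis` ms.
  def slowWrite(millis: Long): Future[Boolean] = Future {
    Thread.sleep(millis)
    true
  }

  // Bounded wait: Await.result throws TimeoutException once `limit` elapses,
  // even if the write itself is healthy and only slow.
  def waitBounded(f: Future[Boolean], limit: FiniteDuration): Either[String, Boolean] =
    Try(Await.result(f, limit)) match {
      case Success(ok)                  => Right(ok)
      case Failure(_: TimeoutException) => Left("timed out")
      case Failure(e)                   => Left(s"failed: ${e.getMessage}")
    }

  // Unbounded wait: blocks until the write completes or fails.
  def waitUnbounded(f: Future[Boolean]): Boolean =
    Await.result(f, Duration.Inf)
}
```

With `Duration.Inf` a slow write never fails spuriously, but a write that truly hangs would block the caller forever, which is the case the question above is asking about.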

wbo4958 added 3 commits May 21, 2025 11:47

[jvm-packages] ExtenralMemory: Overlap the caching time

b81b082

fix small file

6008ef3

fix

795e151

trivialfis reviewed May 21, 2025

wbo4958 wants to merge 3 commits into dmlc:master from wbo4958:external-mem-overlap