bench: add remote_spawn benchmark for inject queue contention #7944
Darksonn merged 2 commits into tokio-rs:master
Conversation
Add a benchmark that measures contention on the scheduler's inject queue mutex (push_remote_task) when multiple external threads spawn tasks into the runtime simultaneously. Every rt.spawn() from a non-worker thread unconditionally goes through push_remote_task, so this directly measures the scalability of the inject queue.

Results on an M1 Max MacBook Pro (10 cores), spawning 12,800 total tasks:

threads/1: 3.39 ms (265 ns/task, 1.00x)
threads/2: 4.74 ms (370 ns/task, 1.40x)
threads/4: 5.89 ms (460 ns/task, 1.74x)
threads/8: 8.10 ms (633 ns/task, 2.39x)

Wall-clock time increases with more threads despite constant total work, confirming the single mutex serializes producers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
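The bottleneck being measured can be modeled outside tokio entirely. The sketch below is a hypothetical std-only stand-in, not the actual benchmark: N producer threads funnel pushes through a single Mutex-protected queue, which is the same serialization pattern as the inject queue's push_remote_task path.

```rust
use std::collections::VecDeque;
use std::sync::{Barrier, Mutex};
use std::time::{Duration, Instant};

const TOTAL_TASKS: usize = 12_800;

// All producers push through one Mutex-guarded queue, mirroring the single
// inject-queue lock; returns (wall time, total items pushed).
fn contended_push(num_threads: usize) -> (Duration, usize) {
    let queue = Mutex::new(VecDeque::new());
    let barrier = Barrier::new(num_threads);
    let per_thread = TOTAL_TASKS / num_threads;

    let start = Instant::now();
    std::thread::scope(|s| {
        for _ in 0..num_threads {
            let queue = &queue;
            let barrier = &barrier;
            s.spawn(move || {
                barrier.wait(); // release all producers at once
                for i in 0..per_thread {
                    // Every push takes the same lock, so producers serialize.
                    queue.lock().unwrap().push_back(i);
                }
            });
        }
    });
    let elapsed = start.elapsed();
    (elapsed, queue.into_inner().unwrap().len())
}

fn main() {
    for threads in [1, 2, 4, 8] {
        let (elapsed, pushed) = contended_push(threads);
        println!("threads/{threads}: {elapsed:?} for {pushed} pushes");
    }
}
```

As in the real benchmark, total work is constant across thread counts, so any wall-clock growth is attributable to contention on the one lock.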
I also tried a version of this that had remote threads pushing onto an mpsc unbounded queue -- it displayed much worse scaling behavior, but it wasn't clear how much of that was due to the mpsc queue itself vs. the inject queue.
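For illustration, the mpsc variant described above can be sketched with std::sync::mpsc alone. This is a hypothetical reconstruction, not the code that was actually tried; it keeps the same producer pattern but swaps the mutex-guarded queue for an unbounded channel.

```rust
use std::sync::{mpsc, Barrier};
use std::time::{Duration, Instant};

const TOTAL_TASKS: usize = 12_800;

// Same producer pattern, but pushing through an unbounded mpsc channel
// instead of one shared mutex; returns (wall time, items received).
fn mpsc_push(num_threads: usize) -> (Duration, usize) {
    let (tx, rx) = mpsc::channel();
    let barrier = Barrier::new(num_threads);
    let per_thread = TOTAL_TASKS / num_threads;

    let start = Instant::now();
    std::thread::scope(|s| {
        for _ in 0..num_threads {
            let tx = tx.clone(); // each producer gets its own sender
            let barrier = &barrier;
            s.spawn(move || {
                barrier.wait();
                for i in 0..per_thread {
                    tx.send(i).unwrap();
                }
            });
        }
    });
    let elapsed = start.elapsed();
    drop(tx); // close the channel so the drain below terminates
    (elapsed, rx.into_iter().count())
}

fn main() {
    for threads in [1, 2, 4, 8] {
        let (elapsed, received) = mpsc_push(threads);
        println!("threads/{threads}: {elapsed:?} for {received} sends");
    }
}
```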
martin-g left a comment
By using .iter_custom() you could measure only the time spent spawning and exclude the time spent awaiting the join handles and cleaning up.
diff --git i/benches/remote_spawn.rs w/benches/remote_spawn.rs
index 16cfa0b9..d0aa4741 100644
--- i/benches/remote_spawn.rs
+++ w/benches/remote_spawn.rs
@@ -32,38 +32,46 @@ fn remote_spawn_contention(c: &mut Criterion) {
|b, &num_threads| {
let rt = rt();
let tasks_per_thread = TOTAL_TASKS / num_threads;
+ let barrier = Barrier::new(num_threads);
- b.iter(|| {
- let barrier = Barrier::new(num_threads);
-
- std::thread::scope(|s| {
- let handles: Vec<_> = (0..num_threads)
- .map(|_| {
- let barrier = &barrier;
- let rt = &rt;
- s.spawn(move || {
- let mut join_handles = Vec::with_capacity(tasks_per_thread);
- barrier.wait();
-
- for _ in 0..tasks_per_thread {
- join_handles.push(rt.spawn(async {}));
- }
- join_handles
+ b.iter_custom(|iters| {
+ let mut total_duration = std::time::Duration::ZERO;
+ for _ in 0..iters {
+
+ let start = std::time::Instant::now();
+
+ let all_handles = std::thread::scope(|s| {
+ let handles: Vec<_> = (0..num_threads)
+ .map(|_| {
+ let barrier = &barrier;
+ let rt = &rt;
+ s.spawn(move || {
+ let mut join_handles = Vec::with_capacity(tasks_per_thread);
+ barrier.wait();
+
+ for _ in 0..tasks_per_thread {
+ join_handles.push(rt.spawn(async {}));
+ }
+ join_handles
+ })
})
- })
- .collect();
+ .collect();
+
+ handles
+ .into_iter()
+ .flat_map(|h| h.join().unwrap())
+ .collect::<Vec<_>>()
+ });
- let all_handles: Vec<_> = handles
- .into_iter()
- .flat_map(|h| h.join().unwrap())
- .collect();
+ total_duration += start.elapsed();
rt.block_on(async {
for h in all_handles {
h.await.unwrap();
}
});
- });
+ }
+ total_duration
});
},
);
@@ -85,7 +93,6 @@ fn parallelism_levels() -> Vec<usize> {
fn rt() -> Runtime {
runtime::Builder::new_multi_thread()
- .enable_all()
.build()
.unwrap()
}
Thanks, will implement this feedback later this afternoon! Do you agree that this is measuring the right thing? (That's my biggest concern, that I might accidentally be measuring the wrong contention!)
Feedback incorporated, thanks!
Use iter_custom to time only the spawn phase (push_remote_task contention), excluding the await/cleanup of join handles. Also remove enable_all() since the benchmark doesn't need IO or timers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
I think so, yes!
FWIW, once this lands I've got an implementation of a sharded queue (similar to #7757) that improves scalability considerably (tested up to 64 cores).
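The general sharding idea behind that remark can be sketched as follows. This is a hedged illustration of the technique, not tokio's actual implementation or the code in the linked PR: split the one inject mutex into several independently locked shards and spread producers across them so concurrent pushes rarely collide on the same lock.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// One possible shape of a sharded inject queue (hypothetical): producers
// are spread round-robin over several shards, each behind its own mutex.
pub struct ShardedInject<T> {
    shards: Vec<Mutex<VecDeque<T>>>,
    next: AtomicUsize, // round-robin shard selector
}

impl<T> ShardedInject<T> {
    pub fn new(num_shards: usize) -> Self {
        Self {
            shards: (0..num_shards)
                .map(|_| Mutex::new(VecDeque::new()))
                .collect(),
            next: AtomicUsize::new(0),
        }
    }

    /// Producer side: pick a shard and lock only that shard,
    /// so two concurrent producers usually touch different locks.
    pub fn push(&self, value: T) {
        let idx = self.next.fetch_add(1, Ordering::Relaxed) % self.shards.len();
        self.shards[idx].lock().unwrap().push_back(value);
    }

    /// Consumer side: scan shards until a queued item is found.
    pub fn pop(&self) -> Option<T> {
        self.shards
            .iter()
            .find_map(|shard| shard.lock().unwrap().pop_front())
    }
}

fn main() {
    let q = ShardedInject::new(4);
    for i in 0..100 {
        q.push(i);
    }
    let mut drained = 0;
    while q.pop().is_some() {
        drained += 1;
    }
    println!("drained {drained} tasks");
}
```

The trade-off is that consumers may have to scan multiple shards, and global FIFO ordering is lost, which is why a change like this needs exactly the kind of contention benchmark this PR adds.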
Is anything else needed to merge this? Happy to take on any remaining work.