benchmark for arena2 vs boa gc #35

Open
shruti2522 wants to merge 2 commits into boa-dev:main from shruti2522:bench-arena2

Contributor

@shruti2522 shruti2522 commented Mar 6, 2026

fix #29

  • Implemented MarkSweepGarbageCollector with arena2 in mark_sweep_arena2
  • Added the benchmark arena2_vs_boa_gc and documented the results in notes
  • Fixed some errors occurring in the mark_sweep tests

TODO once approved: plug this implementation into boa

bench results:

arena2 is roughly 2x to 2.5x faster than boa_gc for simple allocations and collection sweeps. In the mixed-workload and memory-pressure benchmarks, the two perform about the same.

shruti@DESKTOP-QR46EJI:~/oscars$ cd oscars
cargo bench --bench arena2_vs_boa_gc
warning: /home/shruti/oscars/oscars_derive/Cargo.toml: no edition set: defaulting to the 2015 edition while the latest is 2024
]
    Finished `bench` profile [optimized] target(s) in 1.76s
     Running benches/arena2_vs_boa_gc.rs (/home/shruti/oscars/target/release/deps/arena2_vs_boa_gc-22212f5ed455f0de)
Gnuplot not found, using plotters backend
Benchmarking gc_node_allocation/arena2/10: Collecting 100 samples in est
gc_node_allocation/arena2/10
                        time:   [315.14 ns 323.99 ns 333.74 ns]
                        change: [-8.5410% -4.0484% +0.7687%] (p = 0.10 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe
Benchmarking gc_node_allocation/boa_gc/10: Collecting 100 samples in est
gc_node_allocation/boa_gc/10
                        time:   [725.43 ns 753.55 ns 783.40 ns]
                        change: [-7.0067% +0.1724% +7.7033%] (p = 0.96 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low mild
  4 (4.00%) high mild
Benchmarking gc_node_allocation/arena2/100: Collecting 100 samples in es
gc_node_allocation/arena2/100
                        time:   [3.0243 µs 3.2124 µs 3.4581 µs]
                        change: [+1.6103% +7.3988% +13.516%] (p = 0.02 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
Benchmarking gc_node_allocation/boa_gc/100: Collecting 100 samples in es
gc_node_allocation/boa_gc/100
                        time:   [6.2184 µs 6.4585 µs 6.6943 µs]
                        change: [+0.2528% +6.8615% +14.500%] (p = 0.05 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) low mild
  2 (2.00%) high mild
Benchmarking gc_node_allocation/arena2/1000: Collecting 100 samples in e
gc_node_allocation/arena2/1000
                        time:   [25.822 µs 27.342 µs 29.206 µs]
                        change: [-9.6207% -3.9000% +2.4630%] (p = 0.21 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe
Benchmarking gc_node_allocation/boa_gc/1000: Collecting 100 samples in e
gc_node_allocation/boa_gc/1000
                        time:   [54.173 µs 56.229 µs 58.288 µs]
                        change: [-7.9614% -0.3138% +8.0131%] (p = 0.94 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking gc_collection_pause/arena2/100: Collecting 100 samples in e
gc_collection_pause/arena2/100
                        time:   [3.4743 µs 3.5647 µs 3.6666 µs]
                        change: [-16.868% -12.899% -8.9361%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
Benchmarking gc_collection_pause/boa_gc/100: Collecting 100 samples in e
gc_collection_pause/boa_gc/100
                        time:   [7.1347 µs 7.3703 µs 7.6789 µs]
                        change: [-9.2123% -3.6024% +1.7062%] (p = 0.23 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
Benchmarking gc_collection_pause/arena2/500: Collecting 100 samples in e
gc_collection_pause/arena2/500
                        time:   [14.763 µs 15.263 µs 15.831 µs]
                        change: [+0.5742% +5.1749% +9.6980%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
Benchmarking gc_collection_pause/boa_gc/500: Collecting 100 samples in e
gc_collection_pause/boa_gc/500
                        time:   [31.634 µs 32.509 µs 33.384 µs]
                        change: [+0.2625% +2.9273% +5.9544%] (p = 0.05 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
Benchmarking gc_collection_pause/arena2/1000: Collecting 100 samples in 
gc_collection_pause/arena2/1000
                        time:   [28.546 µs 29.589 µs 30.729 µs]
                        change: [-3.2893% +1.5700% +6.1712%] (p = 0.53 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
Benchmarking gc_collection_pause/boa_gc/1000: Collecting 100 samples in 
gc_collection_pause/boa_gc/1000
                        time:   [71.514 µs 74.927 µs 78.739 µs]
                        change: [+13.519% +18.927% +24.575%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

Benchmarking mixed_workload/arena2: Collecting 100 samples in estimated 
mixed_workload/arena2   time:   [16.873 µs 17.851 µs 19.037 µs]
                        change: [-7.3270% -2.1537% +3.2701%] (p = 0.45 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
Benchmarking mixed_workload/boa_gc: Collecting 100 samples in estimated 
mixed_workload/boa_gc   time:   [17.066 µs 17.820 µs 18.771 µs]
                        change: [+10.531% +17.515% +24.242%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

Benchmarking memory_pressure/arena2: Collecting 100 samples in estimated
memory_pressure/arena2  time:   [44.222 µs 45.998 µs 48.040 µs]
                        change: [+7.8436% +13.040% +18.345%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
Benchmarking memory_pressure/boa_gc: Collecting 100 samples in estimated
memory_pressure/boa_gc  time:   [44.497 µs 46.599 µs 49.070 µs]
                        change: [+23.411% +29.573% +36.275%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
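
As a quick check of the summary above, the ratios can be recomputed from the middle value of each `time:` triple in the log. This is only a sketch: the numbers are copied by hand from this run, and the `speedup` helper is a hypothetical name, not part of the benchmark itself.

```rust
/// Speedup of `candidate` over `baseline`; values above 1.0 mean the candidate is faster.
fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}

fn main() {
    // (benchmark, arena2 time, boa_gc time), copied from the log above.
    // Units differ between rows (ns or µs) but match within each row,
    // so the ratio is unit-free.
    let rows: [(&str, f64, f64); 8] = [
        ("gc_node_allocation/10", 323.99, 753.55),
        ("gc_node_allocation/100", 3.2124, 6.4585),
        ("gc_node_allocation/1000", 27.342, 56.229),
        ("gc_collection_pause/100", 3.5647, 7.3703),
        ("gc_collection_pause/500", 15.263, 32.509),
        ("gc_collection_pause/1000", 29.589, 74.927),
        ("mixed_workload", 17.851, 17.820),
        ("memory_pressure", 45.998, 46.599),
    ];

    for (name, arena2, boa_gc) in rows {
        println!("{name}: {:.2}x", speedup(boa_gc, arena2));
    }
}
```

With these figures the allocation and collection rows come out between roughly 2.0x and 2.5x, while mixed_workload and memory_pressure stay within about 1% of 1.0x. Several runs in the log report regressions or outliers, so the ratios are best read as approximate.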

@shruti2522 shruti2522 changed the title benchmark for arena2 vs boa gc benchmark for arena2 vs boa gc (WIP) Mar 6, 2026
@shruti2522 shruti2522 changed the title benchmark for arena2 vs boa gc (WIP) benchmark for arena2 vs boa gc Mar 6, 2026
@shruti2522 shruti2522 marked this pull request as ready for review March 6, 2026 13:07
Member

@nekevss nekevss left a comment

Hmmmm, for the sake of quick iteration, we should split these into two modules, but can you attempt to minimize the duplication here, thinking specifically about the tracing and cell modules?

This would mean splitting off the pointers, the internals, and the collector impl into versioned modules. But then we lessen the duplication.

We will also need to eventually think about a way to reuse some of this code maybe behind a trait. But let's leave that for later and focus on minimizing the duplication while having two implementations.

// SAFETY: `ArenaHeapItem` is `repr(transparent)`, use addr_of_mut! to avoid
// creating a &mut reference during trace
let raw: *mut ArenaHeapItem<GcBox<NonTraceable>> = self.as_heap_ptr().as_ptr();
unsafe { NonNull::new_unchecked(core::ptr::addr_of_mut!((*raw).0)) }
Member

use &raw mut instead per the docs
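
A minimal sketch of that suggestion, assuming a toolchain where the `&raw mut` operator is stable (Rust 1.82 or later). `Wrapper` and `inner_ptr` are hypothetical stand-ins for `ArenaHeapItem` and the accessor under review, not the crate's real definitions.

```rust
use core::ptr::NonNull;

// Hypothetical stand-in for `ArenaHeapItem`: a transparent wrapper,
// so field 0 lives at the same address as the wrapper itself.
#[repr(transparent)]
struct Wrapper<T>(T);

/// Projects a pointer to the wrapper into a pointer to its inner value
/// without ever materialising a `&mut` reference.
fn inner_ptr<T>(ptr: NonNull<Wrapper<T>>) -> NonNull<T> {
    let raw: *mut Wrapper<T> = ptr.as_ptr();
    // SAFETY: `raw` comes from a `NonNull`, so the projected field pointer
    // is non-null as well. `&raw mut` yields a raw pointer directly,
    // exactly like `core::ptr::addr_of_mut!((*raw).0)`.
    unsafe { NonNull::new_unchecked(&raw mut (*raw).0) }
}

fn main() {
    let mut w = Wrapper(41u32);
    let p = inner_ptr(NonNull::from(&mut w));
    // SAFETY: `p` points into `w`, which is alive and not otherwise borrowed here.
    unsafe { *p.as_ptr() += 1 };
    assert_eq!(w.0, 42);
}
```

The two spellings are equivalent in behaviour; the operator form just avoids the macro.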

let raw: *mut ArenaHeapItem<GcBox<NonTraceable>> = self.as_heap_ptr().as_ptr();
// SAFETY: `raw` is non-null because it comes from `as_heap_ptr()`
// `ArenaHeapItem` is `#[repr(transparent)]` so it shares the same address as field 0
unsafe { NonNull::new_unchecked(core::ptr::addr_of_mut!((*raw).0)) }
Member

use &raw mut instead

pub(crate) type ErasedEphemeron = core::ptr::NonNull<
ArenaHeapItem<
Ephemeron<
crate::collectors::mark_sweep_arena2::internals::NonTraceable,
Member

nit: weird formatting

Maybe just import NonTraceable directly and don't indent this.

Member

I don't think this needs to be copied again. Is there a reason the other cell intrinsics can't be shared between the two approaches here?

Contributor Author

Removed the duplication: cell.rs, trace.rs and gc_header.rs now just re-export the mark_sweep ones directly, since they share the exact same types. Also added a comment about the trace macros so it's clear in the future.
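
For readers unfamiliar with the pattern, a rough sketch of what such a re-export can look like is below. The inline module layout and the `SharedCell` type are hypothetical placeholders chosen for illustration; the actual file contents in this PR may differ.

```rust
use std::any::TypeId;

mod collectors {
    pub mod mark_sweep {
        // Hypothetical stand-in for the shared cell module.
        pub mod cell {
            #[derive(Debug, Default)]
            pub struct SharedCell<T> {
                pub value: T,
            }
        }
    }

    pub mod mark_sweep_arena2 {
        // No second copy of the code: the arena2 module simply
        // re-exports everything from the mark_sweep one.
        pub mod cell {
            pub use super::super::mark_sweep::cell::*;
        }
    }
}

use collectors::{mark_sweep, mark_sweep_arena2};

fn main() {
    // Built through the arena2 path...
    let cell = mark_sweep_arena2::cell::SharedCell { value: 7u8 };
    // ...and accepted where the mark_sweep path is expected,
    // because both paths name the same type.
    let same: mark_sweep::cell::SharedCell<u8> = cell;
    assert_eq!(same.value, 7);
    assert_eq!(
        TypeId::of::<mark_sweep::cell::SharedCell<u8>>(),
        TypeId::of::<mark_sweep_arena2::cell::SharedCell<u8>>()
    );
}
```

Because the re-export names the same type rather than a copy, trait impls written against the mark_sweep definitions apply unchanged to both collectors.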

@shruti2522 shruti2522 force-pushed the bench-arena2 branch 2 times, most recently from 2d884ef to d0686d1 Compare March 7, 2026 13:00

Development

Successfully merging this pull request may close these issues.

Benchmarking MarkSweepGarbageCollector in Boa with arena2
