Make benchmarks compatible with both CUDA.jl v5 and v6 #5532

Merged
giordano merged 2 commits into main from mg/benchmarking-cuda-6 on Apr 21, 2026

@giordano
Collaborator

Alternative to #5531, which could still be merged first as a stop-gap solution. This PR requires actual testing.

@giordano giordano added the benchmark performance runs preconfigured benchmarks and spits out timing label Apr 21, 2026
@giordano giordano marked this pull request as draft April 21, 2026 09:13
@giordano giordano marked this pull request as ready for review April 21, 2026 09:13
@giordano
Collaborator Author

Seems to be working so far 🥳

@codecov

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.65%. Comparing base (a3fcdc5) to head (25837bd).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5532      +/-   ##
==========================================
+ Coverage   68.60%   68.65%   +0.05%     
==========================================
  Files         403      403              
  Lines       22551    22552       +1     
==========================================
+ Hits        15471    15484      +13     
+ Misses       7080     7068      -12     
| Flag | Coverage Δ | |
|---|---|---|
| buildkite | 68.65% <ø> (+0.05%) | ⬆️ |
| julia | 68.65% <ø> (+0.05%) | ⬆️ |

@github-actions
Contributor

Benchmark Comparison: PR vs Main

| Benchmark | PR (pts/s) | Main (pts/s) | Change |
|---|---:|---:|---:|
| EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 42462672.532 | 42507194.439 | -0.1% |
| EarthOcean_tripolar_180x90x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 23867681.322 | 23187741.992 | +2.9% |
| EarthOcean_tripolar_720x360x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 43393810.160 | 43394928.404 | -0% |
| EarthOcean_tripolar_360x180x50_F32_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 53165527.237 | 54098197.903 | -1.7% |
| EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_nothing_2tr | 68462891.386 | 68349667.676 | +0.2% |
| EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE+Biharmonic_2tr | 31262769.336 | 31286934.757 | -0.1% |
| EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE+GM+Biharmonic_2tr | 11336474.645 | 11330203.392 | +0.1% |
| EarthOcean_tripolar_360x180x50_F64_nothing_nothing_CATKE_2tr | 87082612.641 | 87143693.341 | -0.1% |
| EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariant5_WENO5_CATKE_2tr | 53894615.806 | 53890657.284 | +0% |
| EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariant9_WENO9_CATKE_2tr | 29405834.654 | 29194698.310 | +0.7% |
| EarthOcean_lat_lon_zstar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 33365530.579 | 33393935.169 | -0.1% |
| EarthOcean_immersed_lat_lon_zstar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 38974161.447 | 39042114.348 | -0.2% |
| EarthOcean_tripolar_zstar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 39522824.404 | 39654469.507 | -0.3% |
| EarthOcean_lat_lon_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 39033975.079 | 39004599.152 | +0.1% |
| EarthOcean_immersed_lat_lon_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr | 42728819.581 | 42838196.174 | -0.3% |
| EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_3tr | 38487172.640 | 38492387.926 | -0% |
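
For readers checking the Change column by hand, the values above are consistent with a plain relative difference, `(PR - Main) / Main`, rounded to one decimal place. The sketch below assumes that formula; the script the benchmark bot actually runs is not shown in this thread, and the function name and short row labels are hypothetical.

```python
def percent_change(pr: float, main: float) -> float:
    """Relative change of the PR value with respect to main, in percent.

    Assumed formula: (PR - Main) / Main * 100. Not taken from the bot's source.
    """
    return (pr - main) / main * 100.0


# Two rows copied from the table above (throughput in pts/s).
# The keys are shortened, hypothetical labels for the full benchmark names.
rows = {
    "tripolar_360x180x50": (42462672.532, 42507194.439),
    "tripolar_180x90x50": (23867681.322, 23187741.992),
}

# Round to one decimal place, matching the precision shown in the table.
changes = {
    name: round(percent_change(pr, main), 1) for name, (pr, main) in rows.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
# prints:
# tripolar_360x180x50: -0.1%
# tripolar_180x90x50: +2.9%
```

Since throughput is in points per second, a positive change means the PR is faster; for the kernel timings in milliseconds the sign reads the other way.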

NSYS Kernel Profiling

EarthOcean_tripolar_360x180x50_F64_WENOVectorInvariantDefault_WENO7_CATKE_2tr

| Kernel | Median (ms) | Main (ms) | Change | Instances | Avg (ms) | Min (ms) | Max (ms) |
|---|---:|---:|---:|---:|---:|---:|---:|
| gpu_compute_hydrostatic_free_surface_Gu_ | 3.738 | 3.734 | +0.1% | 318 | 3.726 | 2.590 | 3.875 |
| gpu_compute_hydrostatic_free_surface_Gv_ | 3.700 | 3.688 | +0.3% | 318 | 3.695 | 2.553 | 3.983 |
| gpu_compute_hydrostatic_free_surface_Gc_ | 2.337 | 2.302 | +1.5% | 315 | 2.344 | 2.302 | 2.651 |
| gpu_compute_hydrostatic_free_surface_Gc_ | 2.314 | 2.302 | +0.5% | 315 | 2.321 | 2.276 | 2.645 |
| gpu_compute_hydrostatic_free_surface_Gc_ | 2.307 | 2.302 | +0.2% | 315 | 2.314 | 2.279 | 2.610 |
| gpu__rk_substep_turbulent_kinetic_energy_ | 1.992 | 1.989 | +0.1% | 315 | 1.996 | 1.987 | 2.176 |
| gpu_compute_CATKE_closure_fields_ | 1.567 | 1.565 | +0.1% | 318 | 1.571 | 1.557 | 1.705 |
| gpu__compute_w_from_continuity_ | 0.341 | 0.340 | +0.3% | 633 | 0.341 | 0.337 | 0.347 |
| gpu_compute_TKE_diffusivity_ | 0.643 | 0.642 | +0.2% | 315 | 0.645 | 0.637 | 0.694 |
| gpu__compute_split_explicit_transport_velocities_ | 0.491 | 0.482 | +1.9% | 315 | 0.491 | 0.488 | 0.497 |

@giordano giordano merged commit 3fd9c79 into main Apr 21, 2026
64 checks passed
@giordano giordano deleted the mg/benchmarking-cuda-6 branch April 21, 2026 13:38