Using NERSC's Perlmutter HPC #5513
Overview
NERSC's Perlmutter supercomputer is located at Lawrence Berkeley National Lab in Berkeley, California.
Perlmutter is an HPE (Hewlett Packard Enterprise) Cray EX supercomputer consisting of 1792 GPU-accelerated nodes (each with one AMD EPYC 7763 processor and four NVIDIA A100 GPUs, for 448 TB of main memory and 328 TB of GPU memory in total) and 3072 CPU-only nodes with two processors each, all connected by the HPE Slingshot 11 interconnect.
Perlmutter uses the Slurm Workload Manager for batch job submission.
[Note: this post is subject to change. Let's try to keep it up to date; please comment below if something does not work.]
Scope
This discussion can cover anything related to getting results from running Oceananigans on Perlmutter, including installing Julia, setting up CUDA and MPI, configuring Slurm batch submission scripts, and using other Julia packages in conjunction with Oceananigans.
Links
- NERSC documentation: https://docs.nersc.gov
Getting started on Perlmutter
It's assumed as a prerequisite that you have access to Perlmutter.
The first task is to get Julia onto the system. See the Julia section of the NERSC docs for details.
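For example (a hedged sketch: NERSC's docs describe a Julia module on Perlmutter, but check what is currently provided):

```bash
# On a Perlmutter login node. Check `module avail julia` for the
# versions NERSC currently ships before relying on this.
module load julia
julia --version
```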
Submit a multi-node GPU job via Slurm
Refer to the NERSC docs for the basics of running jobs on Perlmutter.
In what follows we will describe how to launch a simple 2-node, 8-GPU simulation that exercises the CUDA-aware MPI implementation in Oceananigans.
First, create a script that will exercise the CUDA-aware MPI implementation; call it `hello-cuda-mpi.jl`:
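(The script's contents were lost from this page. Below is a minimal sketch in the spirit described; the grid size, model, and log message are illustrative assumptions, while `using CUDA` and `arch = Distributed(GPU())` are required by the workaround discussed later in this post.)

```julia
# hello-cuda-mpi.jl -- a minimal sketch of a distributed GPU "hello world".
# The grid size and log message are illustrative, not the original's.

using MPI
using CUDA
using Oceananigans
using Oceananigans.DistributedComputations

# Distributed(GPU()) initializes MPI internally and assigns one GPU per rank.
arch = Distributed(GPU())
rank = arch.local_rank

# A small grid partitioned across ranks; halo communication between ranks
# goes through CUDA-aware MPI, which is exactly what we want to exercise.
grid  = RectilinearGrid(arch; size=(64, 64, 16), extent=(2π, 2π, 1))
model = NonhydrostaticModel(; grid)

# One time step forces a halo exchange of GPU arrays over MPI.
time_step!(model, 1)

@info "Hello from rank $rank on $(CUDA.device())"
```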
Next, create a submission script, e.g. named `job.sh`, that contains the following:
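(The submission script also didn't survive here; this sketch is assembled from NERSC's documented conventions. The account name is a placeholder, and the exact `#SBATCH` flags and module names should be checked against the jobscript generator mentioned below.)

```bash
#!/bin/bash
#SBATCH --account=<your_account>   # placeholder: your NERSC project
#SBATCH --constraint=gpu
#SBATCH --qos=regular
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4        # one MPI rank per GPU
#SBATCH --gpus-per-node=4
#SBATCH --time=00:10:00

# Load Cray MPICH (the recommended MPI on Perlmutter), the CUDA toolkit,
# and Julia. Module names follow the NERSC docs; verify them locally.
module load cray-mpich
module load cudatoolkit
module load julia

# Enable GPU-aware communication in Cray MPICH.
export MPICH_GPU_SUPPORT_ENABLED=1

srun julia --project hello-cuda-mpi.jl
```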
The `module` and `export` commands load `cray-mpich` (the recommended MPI implementation to use on Perlmutter) and the CUDA toolkit, and generally ensure your environment is set up properly.

For running your own simulations beyond this toy problem, the Perlmutter jobscript generator is a convenient resource for determining the correct `#SBATCH` and `srun` flags. However, the module loads and environment-variable exports should remain the same.

Now, launch the job!
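That is, from the directory containing both files, submit with:

```bash
sbatch job.sh
```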
But wait! You probably encountered the following message printing on an infinite loop:
This is due to a bug which currently exists in the Cray MPICH implementation whereby, on multi-node jobs launched with `srun`, a malformed environment entry gets inserted after a call to `MPI_Init`. `CUDA.jl` is sensitive to this malformed entry, and thus it breaks our `hello-cuda-mpi.jl` simulation. Until this is fixed, we have the following workaround.
First, create a file called `sanitize_environ.jl` that contains:
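(The file's contents were lost from this page. Per the description above, the idea is to strip the malformed, separator-less entry from the process environment so that iterating `ENV` no longer throws. A sketch of one way to do that on Linux, by compacting libc's raw `environ` array in place:)

```julia
# sanitize_environ.jl -- a sketch; the original file was not preserved.
# Removes environment entries lacking a '=' separator (the malformed
# entry Cray MPICH inserts) by compacting libc's `environ` array in place.

function sanitize_environ()
    # `environ` is libc's NULL-terminated array of "NAME=VALUE" strings.
    env = unsafe_load(cglobal(:environ, Ptr{Ptr{UInt8}}))
    read_idx, write_idx = 1, 1
    while true
        entry = unsafe_load(env, read_idx)
        entry == C_NULL && break
        if occursin('=', unsafe_string(entry))
            # Well-formed entry: keep it, shifting it left if needed.
            unsafe_store!(env, entry, write_idx)
            write_idx += 1
        end
        read_idx += 1
    end
    unsafe_store!(env, C_NULL, write_idx)  # re-terminate the array
    return nothing
end
```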
Then, make the following changes to `hello-cuda-mpi.jl`. First, after `using CUDA`, add the following line:
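(The exact line wasn't preserved; presumably it includes the new file:)

```julia
include("sanitize_environ.jl")
```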
Second, after `arch = Distributed(GPU())` (which internally calls `MPI_Init`), add the following lines:
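(Again a reconstruction; presumably a call to the function defined above:)

```julia
# Strip the malformed entry that Cray MPICH inserted during MPI_Init.
sanitize_environ()
```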
Now, relaunch the job from the command line with `sbatch job.sh` and you should get the correct output (with the sketch script above, one hello line per rank). Et voilà! You now have a CUDA-aware MPI Oceananigans configuration!! 🎉