Commit 23a16d0 (1 parent: 6a77280)

restore many guides, uses of Summit. Mention that it's decommissioned.

File tree: 8 files changed, +220 −3 lines changed

docs/advanced_installation.rst

Lines changed: 10 additions & 0 deletions

@@ -49,6 +49,10 @@ Further recommendations for selected HPC systems are given in the

    MPICC=mpiicc pip install mpi4py --no-binary mpi4py

On Summit, the following line is recommended (with gcc compilers)::

    CC=mpicc MPICC=mpicc pip install mpi4py --no-binary mpi4py

.. tab-item:: conda

Install libEnsemble with Conda_ from the conda-forge channel::

@@ -112,6 +116,12 @@ Further recommendations for selected HPC systems are given in the

    spack info py-libensemble

On some platforms you may wish to run libEnsemble without ``mpi4py``,
using a serial PETSc build. This is often preferable when running on
the launch nodes of a three-tier system (e.g., Summit)::

    spack install py-libensemble +scipy +mpmath +petsc4py ^py-petsc4py~mpi ^petsc~mpi~hdf5~hypre~superlu-dist

The installation will create modules for libEnsemble and the dependent
packages. These can be loaded by running::

docs/known_issues.rst

Lines changed: 2 additions & 0 deletions

@@ -19,6 +19,8 @@ may occur when using libEnsemble.

* Local comms mode (multiprocessing) may fail if MPI is initialized before
  forking processes. This is thought to be responsible for issues combining
  multiprocessing with PETSc on some platforms.
* Remote detection of logical cores via ``LSB_HOSTS`` (e.g., on Summit) returns the
  number of physical cores, as SMT information is not available.
* TCP mode does not support
  (1) more than one libEnsemble call in a given script or
  (2) the auto-resources option to the Executor.
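For illustration, the ``LSB_HOSTS`` behavior above can be sketched with a small helper: LSF lists one hostname per allocated slot, and on Summit only physical cores appeared in that list. The helper name, the hostnames, and the ``batch`` prefix filter are illustrative assumptions, not libEnsemble code:

```python
from collections import Counter

def cores_per_node_from_lsb_hosts(lsb_hosts: str) -> dict:
    """Count slot entries per host in an LSB_HOSTS-style string.

    LSF lists one hostname per allocated slot; on Summit this reflected
    physical cores only, since SMT (hardware-thread) info was not exposed.
    Filtering out batch/launch node entries is an illustrative assumption.
    """
    hosts = lsb_hosts.split()
    return dict(Counter(h for h in hosts if not h.startswith("batch")))

# Example: two compute nodes with 42 physical cores each appear as
# 42 repeats of each hostname (SMT threads are not listed).
sample = " ".join(["batch1"] + ["h41n01"] * 42 + ["h41n02"] * 42)
counts = cores_per_node_from_lsb_hosts(sample)  # -> {"h41n01": 42, "h41n02": 42}
```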

docs/nitpicky

Lines changed: 1 addition & 1 deletion

@@ -46,7 +46,7 @@ py:class libensemble.resources.platforms.Perlmutter

  py:class libensemble.resources.platforms.PerlmutterCPU
  py:class libensemble.resources.platforms.PerlmutterGPU
  py:class libensemble.resources.platforms.Polaris
- py:class libensemble.resources.platforms.Sunspot
+ py:class libensemble.resources.platforms.Summit
  py:class libensemble.resources.rset_resources.RSetResources
  py:class libensemble.resources.env_resources.EnvResources
  py:class libensemble.resources.resources.Resources

docs/platforms/example_scripts.rst

Lines changed: 6 additions & 0 deletions

@@ -28,3 +28,9 @@ information about the respective systems and configuration.

.. literalinclude:: ../../examples/libE_submission_scripts/bebop_submit_slurm_distrib.sh
   :caption: /examples/libE_submission_scripts/bebop_submit_slurm_distrib.sh
   :language: bash

.. dropdown:: Summit (Decommissioned) - On Launch Nodes with Multiprocessing

   .. literalinclude:: ../../examples/libE_submission_scripts/summit_submit_mproc.sh
      :caption: /examples/libE_submission_scripts/summit_submit_mproc.sh
      :language: bash

docs/platforms/platforms_index.rst

Lines changed: 24 additions & 2 deletions

@@ -80,6 +80,26 @@ per worker, and adding the manager onto the first node.

HPC systems that allow only one application to be launched to a node at any one time
will not allow a distributed configuration.

Systems with Launch/MOM Nodes
-----------------------------

Some large systems have a three-tier node setup; that is, they have a separate set of launch nodes
(known as MOM nodes on Cray systems). User batch jobs or interactive sessions run on a launch node.
Most such systems supply a special MPI runner that has some application-level scheduling
capability (e.g., ``aprun``, ``jsrun``), and MPI applications can only be submitted from these nodes.
Examples of such systems include Summit and Sierra.

There are two ways of running libEnsemble on these kinds of systems. The first, and simplest,
is to run libEnsemble on the launch nodes. This is often sufficient if the workers' simulation
or generation functions are not doing much work (other than launching applications). This approach
is inherently centralized; the entire node allocation is available for the worker-launched tasks.

However, running libEnsemble on the compute nodes is potentially more scalable and
will better manage simulation and generation functions that contain considerable
computational work or I/O. The second option, therefore, is to use a proxy task-execution
service such as Balsam_.

Balsam - Externally Managed Applications
----------------------------------------

@@ -190,11 +210,13 @@ libEnsemble on specific HPC systems.

   :titlesonly:

   aurora
   bebop
   frontier
   improv
   perlmutter
   polaris
   spock_crusher
   summit
   srun
   example_scripts

docs/platforms/summit.rst

Lines changed: 158 additions & 0 deletions

@@ -0,0 +1,158 @@

=======================
Summit (Decommissioned)
=======================

Summit_ was an IBM AC922 system located at the Oak Ridge Leadership Computing
Facility (OLCF). Each of the approximately 4,600 compute nodes on Summit contained two
IBM POWER9 processors and six NVIDIA Volta V100 accelerators.

Summit featured three tiers of nodes: login, launch, and compute nodes.

Users on the login nodes submit batch runs to the launch nodes.
Batch scripts and interactive sessions run on the launch nodes. Only the launch
nodes can submit MPI runs to the compute nodes via ``jsrun``.

These docs are maintained to guide libEnsemble's usage on three-tier systems similar to Summit.

Special note on resource sets and Executor submit options
---------------------------------------------------------

When using the portable MPI run configuration options (e.g., ``num_nodes``) with the
:doc:`MPIExecutor<../executor/mpi_executor>` ``submit`` function, it is important
to note that, due to the `resource sets`_ used on Summit, the options refer to
resource sets as follows:

- num_procs (int, optional) – The total number of resource sets for this run.
- num_nodes (int, optional) – The number of nodes on which to submit the run.
- procs_per_node (int, optional) – The number of resource sets per node.

It is recommended that the user define a resource set as the minimal configuration
of CPU cores/processes and GPUs. These can be added to the ``extra_args`` option
of the *submit* function. Alternatively, the portable options can be ignored and
everything expressed in ``extra_args``.

For example, the following *jsrun* line would run three resource sets,
each having one core (with one process) and one GPU, along with some extra options::

    jsrun -n 3 -a 1 -g 1 -c 1 --bind=packed:1 --smpiargs="-gpu"

Expressing this line in the ``submit`` function may look
something like the following::

    exctr = Executor.executor
    task = exctr.submit(app_name="mycode",
                        num_procs=3,
                        extra_args="-a 1 -g 1 -c 1 --bind=packed:1 --smpiargs='-gpu'",
                        app_args="-i input")
This would be equivalent to::

    exctr = Executor.executor
    task = exctr.submit(app_name="mycode",
                        extra_args="-n 3 -a 1 -g 1 -c 1 --bind=packed:1 --smpiargs='-gpu'",
                        app_args="-i input")
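Under these conventions, the mapping from the portable options to a ``jsrun`` command line can be sketched as follows. This is a hypothetical helper written for illustration, not part of libEnsemble's API; the derivation of ``-r`` from ``num_procs // num_nodes`` is an assumption about even packing:

```python
def jsrun_args(num_procs=None, num_nodes=None, procs_per_node=None, extra_args=""):
    """Sketch: map libEnsemble's portable submit options to jsrun flags.

    On Summit the portable options referred to resource sets:
    num_procs -> total resource sets (-n), procs_per_node -> resource
    sets per node (-r).
    """
    parts = ["jsrun"]
    if num_procs is not None:
        parts.append(f"-n {num_procs}")
    if procs_per_node is not None:
        parts.append(f"-r {procs_per_node}")
    elif num_nodes is not None and num_procs is not None:
        # Assume resource sets pack evenly across the requested nodes.
        parts.append(f"-r {num_procs // num_nodes}")
    if extra_args:
        parts.append(extra_args)
    return " ".join(parts)

cmd = jsrun_args(num_procs=3, extra_args="-a 1 -g 1 -c 1 --bind=packed:1")
# -> "jsrun -n 3 -a 1 -g 1 -c 1 --bind=packed:1"
```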
The libEnsemble resource manager works out the resources available to each worker,
but unlike some other systems, ``jsrun`` on Summit dynamically schedules runs to
available slots across and within nodes. It can also queue tasks. This allows
variable-size runs to be handled easily on Summit. If oversubscription of the
``jsrun`` system is desired, then libEnsemble's resource manager can be disabled
in the calling script via::

    libE_specs["disable_resource_manager"] = True

In the above example, the task being submitted used three GPUs, which is half of
those available on a Summit node, so two such tasks could be allocated to each node
(from different workers) if they were running at the same time.
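The packing arithmetic in that example can be made explicit. This is a minimal sketch under the stated assumption that packing is limited by GPU count alone (six GPUs per Summit node); the function name is illustrative:

```python
GPUS_PER_NODE = 6  # a Summit node had six V100 GPUs

def tasks_per_node(gpus_per_task: int, gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Number of concurrent tasks one node can host, by GPU count alone."""
    return gpus_per_node // gpus_per_task

slots = tasks_per_node(3)  # three-GPU tasks pack two per node -> 2
```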
Job Submission
--------------

Summit used LSF_ for job management and submission. For libEnsemble, the most
important command is ``bsub``, used for submitting batch scripts from the login
nodes to execute on the launch nodes.

It is recommended to run libEnsemble on the launch nodes (assuming workers are
submitting MPI applications) using the ``local`` communications mode (multiprocessing).

Interactive Runs
^^^^^^^^^^^^^^^^

You can run interactively with ``bsub`` by specifying the ``-Is`` flag,
similarly to the following::

    $ bsub -W 30 -P [project] -nnodes 8 -Is

This will place you on a launch node.

.. note::
    You will need to reactivate your conda virtual environment.

Batch Runs
^^^^^^^^^^

Batch scripts specify run settings using ``#BSUB`` statements. The following
simple example depicts configuring and launching libEnsemble to a launch node with
multiprocessing. This script also assumes the user is using the ``parse_args()``
convenience function from libEnsemble's :doc:`tools module<../utilities>`.

.. code-block:: bash

    #!/bin/bash -x
    #BSUB -P <project code>
    #BSUB -J libe_mproc
    #BSUB -W 60
    #BSUB -nnodes 128
    #BSUB -alloc_flags "smt1"

    # --- Prepare Python ---

    # Load conda module and gcc.
    module load python
    module load gcc

    # Name of conda environment
    export CONDA_ENV_NAME=my_env

    # Activate conda environment
    export PYTHONNOUSERSITE=1
    source activate $CONDA_ENV_NAME

    # --- Prepare libEnsemble ---

    # Name of calling script
    export EXE=calling_script.py

    # Communication method
    export COMMS="--comms local"

    # Number of workers
    export NWORKERS="--nworkers 128"

    hash -r  # Check no commands hashed (pip/python...)

    # Launch libE
    python $EXE $COMMS $NWORKERS > out.txt 2>&1

With this saved as ``myscript.sh``, allocating, configuring, and queueing
libEnsemble on Summit is achieved by running::

    $ bsub myscript.sh

Example submission scripts are also given in the :doc:`examples<example_scripts>`.

Launching User Applications from libEnsemble Workers
----------------------------------------------------

Only the launch nodes can submit MPI runs to the compute nodes via ``jsrun``.
This can be accomplished in user simulator functions directly. However, it is highly
recommended that the :doc:`Executor<../executor/ex_index>` interface
be used inside the simulator or generator, because this provides a portable interface
with many advantages, including automatic resource detection, launch-failure
resilience, and ease of use.

.. _conda: https://conda.io/en/latest/
.. _LSF: https://www.olcf.ornl.gov/wp-content/uploads/2018/12/summit_workshop_fuson.pdf
.. _mpi4py: https://mpi4py.readthedocs.io/en/stable/

docs/running_libE.rst

Lines changed: 7 additions & 0 deletions

@@ -66,6 +66,10 @@ supercomputers.

from app-launches (if running libEnsemble on a compute node),
set ``libE_specs["dedicated_mode"] = True``.

This mode can also be used to run on a **launch** node of a three-tier
system (e.g., Summit), ensuring the whole compute-node allocation is available
for launching apps. Make sure there are no imports of ``mpi4py`` in your Python scripts.

Note that on macOS (since Python 3.8) and Windows, the default multiprocessing method
is ``"spawn"`` instead of ``"fork"``; to resolve many related issues, we recommend placing
calling script code in an ``if __name__ == "__main__":`` block.

@@ -100,6 +104,9 @@ supercomputers.

(see :doc:`Balsam<executor/balsam_2_executor>`). This nesting does work
with MPICH_ and its derivative MPI implementations.

This mode is also unsuitable when running on the **launch** nodes of
three-tier systems (e.g., Summit); in that case, ``local`` mode is recommended.

.. tab-item:: TCP Comms

Run the Manager on one system and launch workers to remote

libensemble/resources/platforms.py

Lines changed: 12 additions & 0 deletions

@@ -153,6 +153,16 @@ class Frontier(Platform):

    scheduler_match_slots: bool = False


class Summit(Platform):
    mpi_runner: str = "jsrun"
    cores_per_node: int = 42
    logical_cores_per_node: int = 168
    gpus_per_node: int = 6
    gpu_setting_type: str = "option_gpus_per_task"
    gpu_setting_name: str = "-g"
    scheduler_match_slots: bool = False


# Example of a ROCM system
class GenericROCm(Platform):
    mpi_runner: str = "mpich"

@@ -236,13 +246,15 @@ class Known_platforms(BaseModel):

    perlmutter_c: PerlmutterCPU = PerlmutterCPU()
    perlmutter_g: PerlmutterGPU = PerlmutterGPU()
    polaris: Polaris = Polaris()
    summit: Summit = Summit()


# Dictionary of known systems (or system partitions) detectable by domain name
detect_systems = {
    "frontier.olcf.ornl.gov": "frontier",
    "hostmgmt.cm.aurora.alcf.anl.gov": "aurora",
    "hsn.cm.polaris.alcf.anl.gov": "polaris",
    "summit.olcf.ornl.gov": "summit",  # Need to detect gpu count
}
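The ``detect_systems`` dictionary maps domain names to platform entries. A minimal sketch of how such a lookup could work is below, assuming matching by domain suffix (as the keys suggest); libEnsemble's actual detection logic may differ, and the helper name is illustrative:

```python
# Illustrative subset of the domain-to-platform table shown above.
detect_systems = {
    "frontier.olcf.ornl.gov": "frontier",
    "summit.olcf.ornl.gov": "summit",
}

def detect_platform(fqdn: str):
    """Return the platform whose domain suffix matches fqdn, else None.

    Suffix matching is an assumption made for this sketch; in practice the
    fqdn could come from socket.getfqdn() on the running node.
    """
    for domain, name in detect_systems.items():
        if fqdn.endswith(domain):
            return name
    return None

platform = detect_platform("login1.summit.olcf.ornl.gov")  # -> "summit"
```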
