docs/hpc/13_tutorial_intro_hpc/01_intro_hpc.mdx (7 additions, 3 deletions)
@@ -3,14 +3,18 @@
This tutorial is an introduction to using the Greene high-performance computing systems at NYU effectively. It is not intended to be an exhaustive course on parallel programming. The goal is to give new users of Greene an introduction and overview of the tools available and how to use them effectively.
:::warning[Prerequisites]
-Command line experience is necessary for this lesson. We recommend the participants to go through our [Introduction to Using the Shell on Greene](../12_tutorial_intro_shell_hpc/01_intro.mdx), if new to the command line (also known as terminal or shell).
+Command line experience is necessary for this lesson. We recommend participants go through our [Introduction to Using the Shell on Greene](../12_tutorial_intro_shell_hpc/01_intro.mdx) tutorial, if new to the command line (also known as terminal or shell).
:::
:::note[Objectives]
-By the end of this workshop, students will know how to:
+By the end of this tutorial, participants will know how to:
- Identify problems a cluster can help solve
- Use the UNIX shell (also known as terminal or command line) to connect to a cluster.
- Transfer files onto a cluster.
- Submit and manage jobs on a cluster using a scheduler.
- Observe the benefits and limitations of parallel execution.
-:::
+:::
+
+:::info[Provenance]
+This tutorial was adapted for NYU's Greene HPC from [HPC Carpentry](https://github.com/hpc-carpentry)
docs/hpc/13_tutorial_intro_hpc/02_why_use_cluster.mdx (7 additions, 7 deletions)
@@ -17,18 +17,18 @@ How does computing help you do your research? How could more computing help you
Frequently, research problems that use computing can outgrow the desktop or laptop computer where they started:
-- A statistics student wants to do cross-validate their model. This involves running the model 1000 times — but each run takes an hour. Running on their laptop will take over a month!
+- A statistics student wants to cross-validate their model. This involves running the model 1000 times — but each run takes an hour. Running on their laptop will take over a month!
- A genomics researcher has been using small datasets of sequence data, but soon will be receiving a new type of sequencing data that is 10 times as large. It’s already challenging to open the datasets on their computer — analyzing these larger datasets will probably crash it.
- An engineer is using a fluid dynamics package that has an option to run in parallel. So far, they haven’t used this option on their desktop, but in going from 2D to 3D simulations, simulation time has more than tripled and it might be useful to take advantage of that feature.
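The arithmetic behind the statistics example can be checked in a few lines of shell. The run count and one-hour runtime come from the bullet above; the 100-core figure is an illustrative assumption, not a Greene specification:

```bash
# Assumed figures: 1000 runs of 1 hour each; 100 cluster cores is hypothetical
runs=1000
hours_per_run=1
workers=100
laptop_days=$(( runs * hours_per_run / 24 ))        # serial wall time, in days
cluster_hours=$(( runs * hours_per_run / workers )) # ideal parallel wall time
echo "laptop: ~${laptop_days} days, cluster: ~${cluster_hours} hours"
```

Integer division gives roughly 41 days serially versus 10 hours in the ideal parallel case, which matches the "over a month" claim in the bullet.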
-In all these cases, what is needed is access to more computers than can be used at the same time. Luckily, large scale computing systems — shared computing resources with lots of computers — are available at many universities, labs, or through national networks. These resources usually have more central processing units(CPUs), CPUs that operate at higher speeds, more memory, more storage, and faster connections with other computer systems. They are frequently called “clusters”, “supercomputers” or resources for “high performance computing” or HPC. In this lesson, we will usually use the terminology of HPC and HPC cluster.
+In all these cases, what is needed is access to more computers than can be used at the same time. Luckily, large scale computing systems — shared computing resources with lots of computers — are available at many universities, labs, or through national networks. These resources usually have more central processing units (CPUs), CPUs that operate at higher speeds, more memory, more storage, and faster connections with other computer systems. They are frequently called “clusters”, “supercomputers” or resources for “high performance computing” or HPC. In this lesson, we will usually use the terminology of HPC and HPC cluster and we focus on NYU's HPC cluster Greene.
Using a cluster often has the following advantages for researchers:
--**Speed**: With many more CPU cores, often with higher performance specs, than a typical laptop or desktop, HPC systems can offer significant speed up.
+-**Speed**: With many more CPU cores, often with higher performance specs than a typical laptop or desktop, HPC systems can offer significant speed up.
-**Volume**: Many HPC systems have both the processing memory (RAM) and disk storage to handle very large amounts of data. Terabytes of RAM and petabytes of storage are available for research projects.
-**Efficiency**: Many HPC systems operate a pool of resources that are drawn on by many users. In most cases when the pool is large and diverse enough the resources on the system are used almost constantly.
--**Cost**: Bulk purchasing and government funding mean that the cost to the research community for using these systems in significantly less that it would be otherwise.
+-**Cost**: Bulk purchasing and government funding mean that the cost to the research community for using these systems is significantly less than it would be otherwise.
-**Convenience**: Maybe your calculations just take a long time to run or are otherwise inconvenient to run on your personal computer. There’s no need to tie up your own computer for hours when you can use someone else’s instead.
This is how a large-scale compute system like a cluster can help solve problems like those listed at the start of the lesson.
@@ -47,13 +47,13 @@ Learning to use Bash or any other shell sometimes feels more like programming th
## The rest of this lesson
The only way to use these types of resources is by learning to use the command line. This introduction to HPC systems has two parts:
-- We will learn to use the UNIX command line (also known as Bash).
+- We will learn to use the UNIX command line (also known as the Bash shell).
- We will use our new Bash skills to connect to and operate a high-performance computing supercomputer.
The skills we learn here have other uses beyond just HPC: Bash and UNIX skills are used everywhere, be it for web development, running software, or operating servers. It’s become so essential that Microsoft now [ships it as part of Windows](https://apps.microsoft.com/detail/9nblggh4msv6?hl=en-US&gl=US)! Knowing how to use Bash and HPC systems will allow you to operate virtually any modern device. With all of this in mind, let’s connect to a cluster and get started!
:::tip[Key Points]
- High Performance Computing (HPC) typically involves connecting to very large computing systems elsewhere in the world.
-- These HPC systems can be used to do work that would either be impossible or much slower or smaller systems.
-- The standard method of interacting with such systems is via a command line interface such as Bash.
+- These HPC systems can be used to do work that would either be impossible or much slower on smaller systems.
+- The standard method of interacting with such systems is via a command line interface such as the Bash shell.
docs/hpc/13_tutorial_intro_hpc/03_exploring_remote_resources.mdx (5 additions, 5 deletions)
@@ -6,17 +6,17 @@ Questions
- Are all compute nodes alike?
Objectives
-- Survey system resources using nproc, free, and the queuing system
+- Survey system resources using `nproc`, `free`, and the queuing system
- Compare & contrast resources on the local machine, login node, and worker nodes
-- Learn about the various filesystems on the cluster using df
+- Learn about the various filesystems on the cluster using `df`
- Find out who else is logged in
- Assess the number of idle and occupied nodes
:::
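The survey commands named in the objectives work on any Linux machine, so you can try them on your laptop first and compare the numbers with the login and compute nodes later. A minimal sketch:

```bash
# Count the CPU cores available on the current machine
nproc
# Show installed and available memory in human-readable units
free -h
# Show the size and usage of the mounted filesystems
df -h
```

On Greene the same commands report the (much larger) resources of whichever node you run them on, which is the point of the comparison exercise.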
## Look Around the Remote System
If you have not already connected to Greene, please do so now:
```bash
-[NetID@glogin-1~]$ ssh NetID@greene.hpc.nyu.edu
+[user@laptop~]$ ssh NetID@greene.hpc.nyu.edu
```
Take a look at your home directory on the remote system:
```bash
@@ -47,9 +47,9 @@ Most high-performance computing systems run the Linux operating system, which is
afs bin@ dev gpfs lib@ media mnt opt root sbin@ share state tmp var
archive boot etc home lib64@ misc net proc run scratch srv sys usr vast
```
-The `/home/NetID` directory is the one where we generally want to keep all of our files. Other folders on a UNIX OS contain system files and change as you install new software or upgrade your OS.
+The `/home/NetID`, `/scratch/NetID`, `/archive/NetID`, and `/vast/NetID` directories are created for you by default and they are where you'll probably store most of your files, but there are other options as well. Please see the tip below and our [storage documentation](../03_storage/01_intro_and_data_management.mdx) for details about how these directories differ, as well as other storage options available. Other folders on a UNIX OS contain system files and change as you install new software or upgrade your OS.
-:::tip[Using HPC filesystems]
+:::tip[Using the HPC filesystems]
On Greene, you have a number of places where you can store your files. These differ in both the amount of space allocated and whether or not they are backed up.
-**Home** – data stored here is available throughout the HPC system, and often backed up periodically. Please note the limit on the number of files (inodes) which can get used up easily. Use the `myquota` command to ensure that you are not running out of inodes!
docs/hpc/13_tutorial_intro_hpc/04_scheduler_fundamentals.mdx (3 additions, 3 deletions)
@@ -16,7 +16,7 @@ Objectives
## Job Scheduler
An HPC system might have thousands of nodes and thousands of users. How do we decide who gets what and when? How do we ensure that a task is run with the resources it needs? This job is handled by a special piece of software called the *scheduler*. On an HPC system, the scheduler manages which jobs run where and when.
-The following illustration compares these tasks of a job scheduler to a waiter in a restaurant. If you can relate to an instance where you had to wait for a while in a queue to get in to a popular restaurant, then you may now understand why sometimes your job do not start instantly as in your laptop.
+The following illustration compares these tasks of a job scheduler to a waiter in a restaurant. If you can relate to an instance where you had to wait for a while in a queue to get in to a popular restaurant, then you may now understand why your job might not start instantly as it does on your laptop.

@@ -184,7 +184,7 @@ JOBID USER ACCOUNT NAME ST REASON START_TIME TIME TIME_LEFT NODES CPUS
```
:::tip[Cancelling multiple jobs]
-We can also all of our jobs at once using the `-u` option. This will delete all jobs for a specific user (in this case us). Note that you can only delete your own jobs.
+We can also cancel all of our jobs at once using the `-u` option. This will delete all jobs for a specific user (in this case us). Note that you can only delete your own jobs.
Try submitting multiple jobs and then cancelling them all with `scancel -u NetID`.
:::
@@ -212,7 +212,7 @@ Sometimes, you will need a lot of resource for interactive use. Perhaps it’s o
```bash
srun --pty bash
```
-You should be presented with a bash prompt. Note that the prompt will likely change to reflect your new location, in this case the compute node we are logged on. You can also verify this with `hostname`.
+You should be presented with a bash prompt. Note that the prompt will likely change to reflect your new location, in this case the compute node we are logged onto. You can also verify this with `hostname`.
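The `hostname` check works the same everywhere, so you can try it locally before and after starting an interactive job. A minimal sketch (the compute-node name is whatever Slurm assigns, not something you choose):

```bash
# Print the network name of the machine this shell is running on.
# After `srun --pty bash` on Greene, this shows the assigned compute node
# rather than the login node you started from.
hostname
```

Comparing the output before and after `srun --pty bash` is the quickest way to confirm the shell really moved to a compute node.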
:::tip[Creating remote graphics]
To see graphical output inside your jobs, you need to use X11 forwarding. To connect with this feature enabled, use the `-Y` option when you login with `ssh` with the command `ssh -Y username@host`.
docs/hpc/13_tutorial_intro_hpc/06_modules.mdx (6 additions, 6 deletions)
@@ -15,18 +15,18 @@ Before we start using individual software packages, however, we should understan
- versioning
- dependencies
-Software incompatibility is a major headache for programmers. Sometimes the presence (or absence) of a software package will break others that depend on it. Two well known examples are Python and C compiler versions. Python 3 famously provides a `python` command that conflicts with that provided by Python 2. Software compiled against a newer version of the C libraries and then run on a machine that has older C libraries installed will result in a nasty `'GLIBCXX_3.4.20' not found` error.
+**Software incompatibility** is a major headache for programmers. Sometimes the presence (or absence) of a software package will break others that depend on it. Two well known examples are Python and C compiler versions. Python 3 famously provides a `python` command that conflicts with that provided by Python 2. Software compiled against a newer version of the C libraries and then run on a machine that has older C libraries installed will result in a nasty `'GLIBCXX_3.4.20' not found` error.
-Software versioning is another common issue. A team might depend on a certain package version for their research project - if the software version was to change (for instance, if a package was updated), it might affect their results. Having access to multiple software versions allows a set of researchers to prevent software versioning issues from affecting their results.
+**Software versioning** is another common issue. A team might depend on a certain package version for their research project - if the software version was to change (for instance, if a package was updated), it might affect their results. Having access to multiple software versions allows a set of researchers to prevent software versioning issues from affecting their results.
-Dependencies are where a particular software package (or even a particular version) depends on having access to another software package (or even a particular version of another software package). For example, the VASP materials science software may depend on having a particular version of the FFTW (Fastest Fourier Transform in the West) software library available for it to work.
+**Dependencies** are where a particular software package (or even a particular version) depends on having access to another software package (or even a particular version of another software package). For example, the VASP materials science software may depend on having a particular version of the FFTW (Fastest Fourier Transform in the West) software library available for it to work.
## Environment Modules
-Environment modules are the solution to these problems. A *module* is a self-contained description of a software package – it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.
+**Environment modules** are the solution to these problems. A *module* is a self-contained description of a software package – it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.
There are a number of different environment module implementations commonly used on HPC systems: the two most common are *TCL modules* and *Lmod*. Both of these use similar syntax and the concepts are the same so learning to use one will allow you to use whichever is installed on the system you are using. In both implementations the `module` command is used to interact with environment modules. An additional subcommand is usually added to the command to specify what you want to do. For a list of subcommands you can use `module -h` or `module help`. As for all commands, you can access the full help on the *man* pages with `man module`.
-On login you may start out with a default set of modules loaded or you may start out with an empty environment; this depends on the setup of the system you are using.
+On login to Greene you will start out with an empty environment.
## Listing Available Modules
To see available software modules, use `module avail`:
@@ -57,7 +57,7 @@ No modules loaded
```
## Loading and Unloading Software
-To load a software module, use `module load`. In this example we will use R.
+To load a software module, use `module load`. In this example we will use `R`.
Initially, R is not loaded. We can test this by using the `which` command. `which` looks for programs the same way that Bash does, so we can use it to tell us where a particular piece of software is stored.
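The `which` lookup can be tried with any command already on your `PATH`; the same mechanism is why an unloaded module's program is not found. A minimal sketch (the name `some-unloaded-program` is made up for illustration):

```bash
# `which` searches $PATH exactly as Bash does and prints the first match
which bash
# A program not on $PATH prints nothing and exits non-zero;
# "some-unloaded-program" is a hypothetical name, standing in for R
# before `module load` has added it to $PATH
which some-unloaded-program || echo "not found on PATH"
```

After `module load` succeeds, running `which` on the loaded program again would print a path under the module's installation directory, confirming the environment change.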