
Running GPU Jobs on the ORC Clusters

Available GPU Resources on Hopper

The GPU resources on Hopper currently include 2 DGX A100 nodes, each with 8 A100 40GB GPUs, and 24 A100 nodes, each with 4 A100 80GB GPUs. 8 of the A100 80GB GPUs are further partitioned into MIG slices to increase the number of GPU instances that can be started on Hopper.

GPU partitioning

The A100 80GB nodes have 4 NVIDIA A100 GPUs, which can be further partitioned into smaller slices to optimize access and utilization. For example, each GPU can be sliced into as many as 7 instances when operating in MIG (Multi-Instance GPU) mode. MIG mode is a feature that allows a single GPU to be partitioned into multiple instances, each with its own dedicated resources. This enables multiple users or applications to share a single GPU, improving overall utilization and efficiency.

The following table outlines the three available MIG partition types and their resource allocations:

GPU Instance Profiles on the A100

Name          Fraction of Memory   Fraction of SMs   NVDEC Units   L2 Cache Size   Number of Nodes   Total Available
MIG 1g.10gb   1/8                  1/7               0             1/8             8                 64
MIG 2g.20gb   2/8                  2/7               1             2/8             4                 32
MIG 3g.40gb   4/8                  3/7               2             4/8             4                 32

To make the most of the GPUs on Hopper, it is essential to evaluate your job's requirements and select the appropriate GPU slice based on availability and suitability. For instance, if your simulation demands minimal GPU memory, a MIG 1g.10gb slice (providing 10GB of GPU memory) would be more suitable, reserving larger slices for jobs with higher memory needs. In the context of machine learning, training tasks generally require more computation and memory, making a full GPU node or a larger slice like MIG 3g.40gb ideal, while inference tasks can be efficiently executed on smaller slices like MIG 1g.10gb or MIG 2g.20gb.

Our cluster currently offers 32 MIG 3g.40gb partitions, 32 MIG 2g.20gb partitions, and 64 MIG 1g.10gb partitions. This configuration ensures the most efficient use of our limited GPU resources. MIG technology enables better resource allocation and allows for more diverse workloads to be executed simultaneously, enhancing the overall performance and productivity of the cluster. The partitioning of GPU nodes is expected to evolve over time, optimizing resource utilization.
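Because the partitioning evolves over time, it is worth checking what is currently configured before requesting a slice. As a sketch, SLURM's sinfo can report the GRES (GPU) configuration of the GPU partition from a login node; the exact node names and GRES strings will depend on the cluster's current setup:

```shell
# Show node names (%N) and their GRES/GPU configuration (%G) for the gpuq partition
sinfo -p gpuq -o "%N %G"
```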

Running GPU jobs

GPU jobs can be run either from a shell session or from the Open OnDemand web dashboard.

GPU jobs from Open OnDemand

After logging into Open OnDemand, select the app you want to run and complete the resource configuration table. To run your job on any of the available GPU resources, you need to select 'GPU' or 'Contrib GPU' as the partition.

You also need to set the correct GPU size depending on your job's needs.

After setting the additional options, your app will start on the selected GPU once you launch it.

GPU Jobs with SLURM

To run on the GPUs with SLURM, you need to set the correct PARTITION, QOS and GRES option when defining your SLURM parameters.

The Partition and QOS are set with:

  • Partition:

#SBATCH --partition=gpuq

or

#SBATCH --partition=contrib-gpuq

The contrib-gpuq partition can be used by all, but jobs from accounts that are not Hopper GPU node contributors are subject to preemption.

  • QOS:
#SBATCH --qos=gpu

You need to combine the partition and QOS to run on the GPU nodes.
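Putting these together, a minimal set of SLURM directives for a GPU job might look like the sketch below (the GRES line selects the GPU slice; the 1g.10gb slice here is just an example, pick the one that fits your job):

```shell
#SBATCH --partition=gpuq          # or contrib-gpuq
#SBATCH --qos=gpu
#SBATCH --gres=gpu:1g.10gb:1      # example: one 1g.10gb MIG slice
```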

You also need to set the type and number of GPUs you need with the gres parameter. The available GPU GRES options are shown in the following table:

Type of GPU SLURM setting No. of GPUs on Node No. of CPUs RAM
1g 10GB --gres=gpu:1g.10gb:nGPUs 4 64 500GB
2g 20GB --gres=gpu:2g.20gb:nGPUs 4 64 500GB
3g 40GB --gres=gpu:3g.40gb:nGPUs 4 64 500GB
A100 80GB --gres=gpu:A100.80gb:nGPUs 4 64 500GB
DGX A100 40GB --gres=gpu:A100.40gb:nGPUs 8 128 1TB

You would modify your SLURM options to make sure that you are requesting a suitable GPU slice.

GPU runs with SLURM can be made either interactively, directly on the GPU node, or in batch mode with a SLURM script.

Working interactively on a GPU

You start an interactive session on a GPU node with the salloc command:

salloc -p gpuq -q gpu --nodes=1 --ntasks-per-node=12 --gres=gpu:1g.10gb:1 --mem=15gb -t 0-02:00:00 

This command will allocate the specified GPU resources (a 1g.10gb MIG instance), 12 cores, and 15GB of memory for 2 hours on a GPU node. Once the resources become available, your prompt will show that you're on one of the Hopper nodes.

salloc: Granted job allocation 
salloc: Waiting for resource configuration
salloc: Nodes amd021 are ready for job
[user@amd021 ~]$

Once allocated, this gives you direct access to the GPU instance, where you can work interactively from the command line. Modules loaded on the head nodes are exported to the node as well. If you had not already loaded your modules, you can load them now. To check the currently loaded modules on the node, use the command shown below:

$ module list
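From inside the session you can also verify which GPU or MIG instance was assigned to you. As a quick sanity check (output will vary with the slice you requested):

```shell
# List the GPU/MIG devices visible to this job
nvidia-smi -L
```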

The interactive session will persist until you type the 'exit' command as shown below:

$ exit

salloc: Relinquishing job allocation

Ideally, you should use interactive sessions directly on the GPU node for intensive tests that cannot otherwise be run on the head nodes. Be aware that exiting the interactive session will end any runs that were started in the session.

Using a SLURM Submission Script

Once your tests are done and you're ready to run longer jobs, you should switch to batch submission with SLURM. To do this, write a SLURM script that sets the parameters for your job, loads the necessary modules, and executes your Python script; you then submit it to the selected queue, from where it will run. Below is an example SLURM script (run.slurm) for a Python job on the GPU nodes. In the script, the partition is set to gpuq and the number of GPU nodes is set to 1:

#!/bin/bash
#SBATCH --partition=gpuq                    # need to set 'gpuq' or 'contrib-gpuq' partition
#SBATCH --qos=gpu                           # need to select 'gpu' QOS or other relevant QOS
#SBATCH --job-name=python-gpu
#SBATCH --output=/scratch/%u/%x-%N-%j.out   # Output file
#SBATCH --error=/scratch/%u/%x-%N-%j.err    # Error file
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1                 # number of cores needed
#SBATCH --gres=gpu:1g.10gb:1                # up to 8; only request what you need
#SBATCH --mem-per-cpu=3500M                 # memory per CORE; total memory is 1 TB (1,000,000 MB)
#SBATCH --export=ALL
#SBATCH --time=0-02:00:00                   # set to 2hr; please choose carefully

set -x                                      # echo commands as they run
umask 0027

# to see ID and state of GPUs assigned
nvidia-smi

module load gnu10
module load python

python your_script.py                       # replace with your own Python script


Preferably, use the scratch space for your job's files, and submit your SLURM script with

sbatch run.slurm

To access the scratch space, use the cd /scratch/UserId command to change directories (replace 'UserId' with your GMU NetID). Please note that scratch directories have no space limit, but data in /scratch gets purged 90 days from the date of creation, so make sure to move your files to a safe place before the purge.
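As a hedged example, the one-liner below (assuming your scratch directory is /scratch/$USER) lists files more than 80 days old, giving you a chance to move them before the 90-day purge:

```shell
# List files in scratch older than 80 days (candidates for the 90-day purge).
# SCRATCH_DIR defaults to /scratch/$USER; override it to test on another directory.
SCRATCH_DIR="${SCRATCH_DIR:-/scratch/$USER}"
find "$SCRATCH_DIR" -type f -mtime +80 2>/dev/null
```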

To copy files directly from home or scratch to your project or other space, use the cp command, which copies the contents of the source file or directory into the target file or directory. With the -r or -R flag, cp copies entire directories. The commands below copy all files from the scratch space to a project space ("/projects/orctest" in this example):

[UserId@hopper2 ~]$ cd /scratch/UserId
[UserId@hopper2 UserId]$ cp -p -r *  /projects/orctest