CUDA Jobs on Hopper
Compiling and running CUDA jobs on the cluster
To compile CUDA code you need to use the nvcc compiler, which comes with the CUDA toolkit. To use the nvcc command you first need to load the cuda/9.X module.
Compiling CUDA Code
A simple way to compile a single CUDA program is to use the nvcc command:
nvcc sample.cu -o executable_name
Make sure you have the cuda/9.X module loaded before running this command.
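As a concrete illustration, a minimal sample.cu that could be compiled with the command above might look like the sketch below (the file name, kernel name, and problem size are illustrative, not part of the cluster setup):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Simple kernel: each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    const size_t bytes = n * sizeof(float);

    // Allocate and fill host arrays.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Allocate device arrays and copy inputs over.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // 256 threads per block, enough blocks to cover n elements.
    vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    printf("c[10] = %f\n", h_c[10]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Compiling this file with nvcc sample.cu -o executable_name produces a binary you can run on a GPU node.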
Running A CUDA Job
GPUs are treated as generic resources (GRES) by Slurm. You need to request the number of GPUs you require in your Slurm wrapper script; even if you need only one GPU, you must use this option to request the one GPU card. You also need to request the GPU partition in the script.
Below is a sample job script that requests 4 GPUs.
#!/bin/bash
#SBATCH --job-name caffe
#SBATCH --partition gpuq
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 1
#SBATCH --gres=gpu:4
# Load necessary modules
module load cuda/9.2
# List the current modules loaded
module list
# Your program executable or script goes here
./YOUR_PROGRAM
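When the job starts, Slurm restricts the job to the GPUs it was allocated, so your program should see exactly the number requested with --gres. As a quick sanity check, a small sketch (hypothetical file name check_gpus.cu, assumed to be compiled with nvcc as above) can report the devices visible inside the job:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // With --gres=gpu:4 in the job script, this should report 4 devices.
    printf("Visible GPUs: %d\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("  GPU %d: %s\n", i, prop.name);
    }
    return 0;
}
```

Running this binary in place of ./YOUR_PROGRAM is a simple way to confirm the GPU allocation before launching a longer job.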