CUDA Jobs on Hopper
Compiling and running CUDA jobs on the cluster
To compile CUDA code you need to use the nvcc compiler, which comes with the CUDA toolkit. To use the nvcc command you first need to load the cuda/9.X module.
Compiling CUDA Code
A simple way to compile a single CUDA program is to use the nvcc command:
nvcc sample.cu -o executable_name
Make sure you have the cuda/9.X module loaded before running this command.
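As a concrete illustration, a minimal sample.cu that could be compiled with the command above might look like the sketch below (the file name, kernel name, and problem size are illustrative, not part of the cluster setup):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Simple kernel: each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    const size_t bytes = n * sizeof(float);

    // Allocate and fill host arrays.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Allocate device arrays and copy inputs over.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // 256 threads per block, enough blocks to cover n elements.
    vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    printf("c[10] = %f\n", h_c[10]);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Compiling this file with nvcc sample.cu -o executable_name produces a binary you can run on a GPU node.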
Running A CUDA Job
GPUs are treated as generic resources (GRES) by Slurm. You need to request the number of GPUs you require in your Slurm wrapper script; even if you need only one GPU, you must use this option to request the one GPU card. You also need to request the GPU partition in the script.
Below is a sample job script that requests 4 GPUs.
#!/bin/bash
#SBATCH --job-name caffe
#SBATCH --partition gpuq
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 1
#SBATCH --gres=gpu:4
# Load necessary modules
module load cuda/9.2
# List the current modules loaded
module list
# Your program executable or script goes here
./YOUR_PROGRAM
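When the job starts, Slurm restricts the job to the GPUs it was allocated, so your program should see exactly the number requested with --gres. As a quick sanity check, a small sketch (hypothetical file name check_gpus.cu, assumed to be compiled with nvcc as above) can report the devices visible inside the job:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // With --gres=gpu:4 in the job script, this should report 4 devices.
    printf("Visible GPUs: %d\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("  GPU %d: %s\n", i, prop.name);
    }
    return 0;
}
```

Running this binary in place of ./YOUR_PROGRAM is a simple way to confirm the GPU allocation before launching a longer job.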