CUDA

CUDA (Compute Unified Device Architecture) is an interface for programming Nvidia GPUs. It supports languages such as C, C++, and Fortran.

To build and execute code on the GPU A100 partition, please log in to

Note that code written in SYCL, a cross-industry standard language, can also be executed on Nvidia (and AMD) hardware.

Code build

For building code we recommend the NVIDIA hpcx software package, which combines compilers with powerful libraries such as CUDA, cuBLAS, and MPI.

CUDA with cuBLAS
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ module list
Currently Loaded Modulefiles: ... 4) hpcx   5) nvhpc-hpcx/23.1
bgnlogin1 $ nvc -cuda -gpu=cc8.0 cuda.c -o cuda.bin
bgnlogin1 $ nvc -cuda -gpu=cc8.0 -cudalib=cublas cuda_cublas.c -o cuda_cublas.bin

CUDA can be used in combination with MPI.

CUDA with MPI
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ mpicc -cuda -gpu=cc8.0 -cudalib=cublas mpi_cuda_cublas.c -o mpi_cuda_cublas.bin

Code execution

All available Slurm partitions for the A100 GPU nodes are listed on the page Slurm partition GPU A100.

Job script for CUDA
#!/bin/bash
#SBATCH --partition=gpu-a100:shared
#SBATCH --gres=gpu:1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72

./cuda.bin
./cuda_cublas.bin
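
Before launching the binaries, a job script could report how many GPUs were assigned to the job. The following is a minimal sketch, assuming that Slurm exports CUDA_VISIBLE_DEVICES as a comma-separated list for jobs that request --gres=gpu:N (this is common, but depends on the site configuration). The helper name count_visible_gpus is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count the GPUs listed in CUDA_VISIBLE_DEVICES.
# Assumption: Slurm sets the variable to a comma-separated list,
# e.g. "0" for --gres=gpu:1 or "0,1,2,3" for --gres=gpu:4.
count_visible_gpus() {
    local devices="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devices" ]; then
        # Variable unset or empty: report zero GPUs.
        echo 0
        return
    fi
    # Split the list on commas and count the entries.
    local IFS=','
    local -a ids
    read -r -a ids <<< "$devices"
    echo "${#ids[@]}"
}

echo "GPUs visible to this job: $(count_visible_gpus)"
```

Placed before ./cuda.bin, such a check makes a missing or wrong --gres request visible in the job output right away.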

Job script for CUDA with MPI
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --gres=gpu:4
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=72

module load nvhpc-hpcx/23.1
mpirun --np 8 --map-by ppr:2:socket:pe=1 ./mpi_cuda_cublas.bin
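
The rank count in the mpirun call follows from the mapping: ppr:2:socket places two ranks on each socket, so two nodes yield 8 ranks if every node has two sockets (an assumption that is consistent with --np 8, but not stated above). With four GPUs per node this corresponds to one rank per GPU. The sketch below illustrates the arithmetic and a common way to assign each rank its own GPU. The function names are hypothetical, and OMPI_COMM_WORLD_LOCAL_RANK is the node-local rank exported by Open MPI-based launchers, which is assumed to apply to the MPI in this module.

```shell
#!/bin/bash
# Total ranks for "--map-by ppr:N:socket":
# nodes * sockets per node * ranks per socket.
total_ranks() {
    local nodes=$1 sockets_per_node=$2 ranks_per_socket=$3
    echo $(( nodes * sockets_per_node * ranks_per_socket ))
}

# GPU index for a node-local rank, assigned round-robin over the node's GPUs.
gpu_for_local_rank() {
    local local_rank=$1 gpus_per_node=$2
    echo $(( local_rank % gpus_per_node ))
}

# 2 nodes, 2 sockets per node (assumed), 2 ranks per socket
echo "ranks: $(total_ranks 2 2 2)"

# Inside a hypothetical wrapper script started by mpirun,
# each rank could select its GPU like this:
# export CUDA_VISIBLE_DEVICES=$(gpu_for_local_rank "$OMPI_COMM_WORLD_LOCAL_RANK" 4)
# exec ./mpi_cuda_cublas.bin
```

Whether such a wrapper is needed depends on how the application selects its device; codes that already call cudaSetDevice based on the local rank do not require it.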
