CUDA
CUDA (Compute Unified Device Architecture) is an interface for programming Nvidia GPUs. It supports languages such as C, C++, and Fortran.
To build and execute code on the GPU A100 partition, please log in to
- a GPU A100 login node, e.g. bgnlogin.nhr.zib.de.
- see also GPU A100 partition
Note that code written in the cross-industry standard language SYCL can also be executed on Nvidia (and AMD) hardware.
Code build
For code generation we recommend the NVIDIA software package nvhpc-hpcx, which combines compilers with powerful libraries such as CUDA, cuBLAS, and MPI.
CUDA and cuBLAS
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ module list
Currently Loaded Modulefiles:
... 4) hpcx 5) nvhpc-hpcx/23.1
bgnlogin1 $ nvc -cuda -gpu=cc8.0 cuda.c -o cuda.bin
bgnlogin1 $ nvc -cuda -gpu=cc8.0 -cudalib=cublas cuda_cublas.c -o cuda_cublas.bin
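A minimal sketch of what a source file like cuda.c could contain (the file name and kernel are illustrative, not part of the module documentation): a vector addition run on the GPU using managed memory.

```cuda
// Hypothetical cuda.c: add two vectors on the GPU.
// Error checking is kept minimal for brevity.
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Managed (unified) memory is accessible from both host and device.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Compiled with the nvc command shown above, the resulting cuda.bin runs on an A100 (compute capability 8.0, matching -gpu=cc8.0).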
CUDA can be used in combination with MPI.
CUDA with MPI
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ mpicc -cuda -gpu=cc8.0 -cudalib=cublas mpi_cuda_cublas.c -o mpi_cuda_cublas.bin
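As a sketch of what mpi_cuda_cublas.c might look like (file name and contents are assumptions for illustration): each MPI rank binds itself to one GPU, computes a dot product with cuBLAS, and the per-rank results are combined with MPI_Reduce.

```c
// Hypothetical mpi_cuda_cublas.c: one GPU per MPI rank, cuBLAS dot product.
#include <stdio.h>
#include <mpi.h>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Bind each rank to a GPU, round-robin over the visible devices.
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);

    const int n = 1024;
    float *x;
    cudaMallocManaged(&x, n * sizeof(float));
    for (int i = 0; i < n; i++) x[i] = 1.0f;

    cublasHandle_t handle;
    cublasCreate(&handle);
    float local = 0.0f;
    cublasSdot(handle, n, x, 1, x, 1, &local);  // dot product on the GPU

    // Combine the per-rank results on rank 0.
    float global = 0.0f;
    MPI_Reduce(&local, &global, 1, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("global dot = %f\n", global);

    cublasDestroy(handle);
    cudaFree(x);
    MPI_Finalize();
    return 0;
}
```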
Code execution
All Slurm partitions available for the A100 GPUs are listed on Slurm partition GPU A100.
Job script for CUDA
#!/bin/bash
#SBATCH --partition=gpu-a100:shared
#SBATCH --gres=gpu:1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72
./cuda.bin
./cuda_cublas.bin
Job script for CUDA with MPI
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --gres=gpu:4
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=72
module load nvhpc-hpcx/23.1
mpirun --np 8 --map-by ppr:2:socket:pe=1 ./mpi_cuda_cublas.bin
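Assuming one of the scripts above is saved under a file name of your choice (job_mpi_cuda.sh here is only an example), it is submitted to the batch system in the usual Slurm way:

```shell
sbatch job_mpi_cuda.sh   # submit the job script
squeue -u $USER          # check the status of your jobs
```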
Related content
- OpenMP for GPU A100
- PVC GPU Programming
- TensorFlow
- Slurm partition GPU A100
- Parallel programming day 01-2024
- PyTorch