CUDA (Compute Unified Device Architecture) is Nvidia's interface for programming its GPUs. It supports languages such as C, C++, and Fortran.
...
- a GPU A100 login node, like bgnlogin.nhr.zib.de.
- see also GPU A100 partition
Code build
For building code we recommend the software package NVIDIA nvhpc-hpcx, which combines compilers with powerful libraries such as CUDA, cuBLAS, and MPI.
CUDA and cuBLAS:

```text
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ module list
Currently Loaded Modulefiles: ... 4) hpcx 5) nvhpc-hpcx/23.1
bgnlogin1 $ nvc -cuda -gpu=cc8.0 cuda.c -o cuda.bin
bgnlogin1 $ nvc -cuda -gpu=cc8.0 -cudalib=cublas cuda_cublas.c -o cuda_cublas.bin
```
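The `nvc -cuda` build line above expects a CUDA C source file. As a minimal sketch (hypothetical file name and contents, a simple vector-addition kernel using managed memory), `cuda.c` could look like:

```c
/* cuda.c - hypothetical minimal example matching the nvc -cuda build line */
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void add(int n, const float *a, const float *b, float *c) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    float *a, *b, *c;
    /* managed (unified) memory keeps the host/device copies out of the sketch */
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }
    add<<<(n + 255) / 256, 256>>>(n, a, b, c);
    cudaDeviceSynchronize();
    printf("c[0] = %f\n", c[0]);  /* each element should be 3.0 */
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

The `-gpu=cc8.0` flag targets compute capability 8.0, i.e. the A100.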
CUDA can be used in combination with MPI.
CUDA with MPI:

```text
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ mpicc -cuda -gpu=cc8.0 -cudalib=cublas mpi_cuda_cublas.c -o mpi_cuda_cublas.bin
```
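A common pattern in an MPI+CUDA program is to bind each rank to one GPU. A minimal sketch of that idea (hypothetical, not the site's `mpi_cuda_cublas.c`) might be:

```c
/* sketch: each MPI rank selects one GPU via simple round-robin binding */
#include <stdio.h>
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, ndev;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);  /* rank 0 -> GPU 0, rank 1 -> GPU 1, ... */
    printf("rank %d uses GPU %d of %d\n", rank, rank % ndev, ndev);
    MPI_Finalize();
    return 0;
}
```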
Code execution
All Slurm partitions available for the A100 GPUs are listed on Slurm partitions GPU A100.
Job script for CUDA:

```text
#!/bin/bash
#SBATCH --partition=gpu-a100:shared
#SBATCH --gres=gpu:1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72
./cuda.bin
./cuda_cublas.bin
```
Job script for CUDA with MPI:

```text
...
```
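As an illustration only (partition name, GPU count, and task counts here are assumptions, not the site's reference script), a job script for an MPI+CUDA binary could follow this shape:

```bash
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --gres=gpu:4
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4

# one MPI rank per GPU; mpirun comes from the nvhpc-hpcx module
module load nvhpc-hpcx/23.1
mpirun ./mpi_cuda_cublas.bin
```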
Apptainer is provided as a module and can be used to download, build, and run containers, e.g. Nvidia containers:
Apptainer example:

```bash
bgnlogin1 ~ $ module load apptainer
Module for Apptainer 1.1.6 loaded.
# pulling a tensorflow image from nvcr.io - needs to be compatible with the local driver
bgnlogin1 ~ $ apptainer pull tensorflow-22.01-tf2-py3.sif docker://nvcr.io/nvidia/tensorflow:22.01-tf2-py3
...
# example: single-node run calling python from the container in an interactive job using 4 GPUs
bgnlogin1 ~ $ srun -pgpu-a100 --gres=gpu:4 --nodes=1 --pty --interactive --preserve-env ${SHELL}
...
bgn1003 ~ $ apptainer run --nv tensorflow-22.01-tf2-py3.sif python
...
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.config.list_physical_devices("GPU")
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')]
# optional: clean up the apptainer cache
bgnlogin1 ~ $ apptainer cache list
...
bgnlogin1 ~ $ apptainer cache clean
```
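The same container can also run non-interactively from a batch job. A sketch of such a script (the partition choice and the `train.py` script name are assumptions for illustration):

```bash
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --gres=gpu:4
#SBATCH --nodes=1

module load apptainer
# --nv passes the host's Nvidia driver and GPUs into the container
apptainer run --nv tensorflow-22.01-tf2-py3.sif python train.py
```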