Login

Login to the GPU A100 partition is possible through dedicated login nodes, reachable via SSH at bgnlogin.nhr.zib.de:

Example: login
$ ssh -i $HOME/.ssh/id_rsa_zib zib_username@bgnlogin.nhr.zib.de
Enter passphrase for key '/<home_directory>/.ssh/id_rsa_zib':
bgnlogin1$

File systems

The file systems HOME and WORK on the GPU system are the same as on the CPU system, see Quickstart. Access to the node-local SSD space of a compute node is provided via the environment variable LOCAL_TMPDIR, which is defined during a SLURM session (batch or interactive job).
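
The use of LOCAL_TMPDIR inside a job can be sketched as follows. This is a minimal sketch, not an official NHR@ZIB recipe: the file name output.dat, the RESULTS_DIR variable and the mktemp fallback (used only so the snippet also runs outside a SLURM session) are assumptions for illustration. It also assumes that node-local data is no longer reachable once the job has ended, so results are copied back at the end.

```shell
#!/bin/bash
set -eu

# LOCAL_TMPDIR is only defined inside a SLURM session; the mktemp fallback is
# an assumption that lets this sketch run elsewhere.
scratch="${LOCAL_TMPDIR:-$(mktemp -d)}"

# Hypothetical destination for results, e.g. a directory below WORK.
results_dir="${RESULTS_DIR:-$PWD/results}"
mkdir -p "$results_dir"

# Stand-in for the real application: write output to the fast local SSD.
echo "intermediate data" > "$scratch/output.dat"

# Copy the results to a permanent file system before the job ends.
cp "$scratch/output.dat" "$results_dir/"
echo "copied output.dat to $results_dir"
```

In a real job the echo line would be replaced by the application call, with its scratch or output path pointed at $LOCAL_TMPDIR.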

Software and environment modules

Login and compute nodes of the A100 GPU partition run Rocky Linux (currently version 8.6).

Software for the A100 GPU partition provided by NHR@ZIB can be found using the module command, see Quickstart.

Example: Show the currently available software and access compilers
bgnlogin1 ~ $ module avail
...
bgnlogin1 ~ $ module load gcc
...
bgnlogin1 ~ $ module list
Currently Loaded Modulefiles:
 1) HLRNenv   2) sw.a100   3) slurm   4) gcc/11.3.0(default)

Note: Please note the presence of the sw.a100 environment module. When it is loaded, the module command shows the environment modules for software installed for the NVIDIA A100 GPU partition. This is the default setting on the A100 GPU login and compute nodes.

When compiling applications for the A100 GPU partition, we recommend using the A100 GPU login nodes. For very demanding compilations, or when the CUDA drivers must be present, use an A100 GPU compute node via an interactive SLURM job session.

Using the batch system

The GPU nodes are available via partitions of the Slurm batch system.

Lise's CPU-only partition and the A100 GPU partition share the same SLURM batch system. The main SLURM partition for the A100 GPU partition has the name "gpu-a100". An example job script is shown below.

GPU job script
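#!/bin/bash
# Request 2 nodes in the A100 partition with 4 GPUs each (--gres counts per node).
#SBATCH --partition=gpu-a100
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --gres=gpu:4

# Load the CUDA-aware Open MPI build and start the 8 MPI tasks.
module load openmpi/gcc.11/4.1.4
mpirun ./mycode.bin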
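
The resource request in the job script above can be checked with a little arithmetic: 8 tasks on 2 nodes gives 4 tasks per node, which matches the 4 GPUs requested per node via --gres (Slurm counts --gres per node). The following is a minimal sketch of such a check. It assumes that one MPI task per GPU is the intended layout and that Slurm distributes the tasks evenly across the nodes. SLURM_NTASKS and SLURM_JOB_NUM_NODES are standard Slurm job variables; the fallback values and the hard-coded gpus_per_node only mirror the script above.

```shell
#!/bin/bash
# Values come from the running job if available, otherwise from the example script.
nodes="${SLURM_JOB_NUM_NODES:-2}"
ntasks="${SLURM_NTASKS:-8}"
gpus_per_node=4   # taken from --gres=gpu:4 in the example job script

# Integer arithmetic: tasks per node and total GPUs in the allocation.
tasks_per_node=$(( ntasks / nodes ))
total_gpus=$(( nodes * gpus_per_node ))

echo "tasks per node: ${tasks_per_node}, GPUs in job: ${total_gpus}"

# Warn if the layout is not one task per GPU.
if [ "$tasks_per_node" -ne "$gpus_per_node" ]; then
    echo "warning: ${tasks_per_node} tasks per node, but ${gpus_per_node} GPUs per node" >&2
fi
```

Placed before the mpirun line of a job script, the check makes a mismatch between task count and GPU count visible in the job output.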

GPU-aware MPI

For efficient use of MPI-distributed GPU codes, a GPU/CUDA-aware installation of Open MPI is available in the openmpi/gcc.11/4.1.4 environment module. Open MPI respects the resource requests made to Slurm, so no special arguments to mpiexec/mpirun are required. Nevertheless, please check that your application is bound correctly to CPU cores and GPUs. The --report-bindings option of mpiexec/mpirun prints the binding of each process.
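
In addition to --report-bindings, a small wrapper script can show what each rank actually sees. The sketch below is a hypothetical helper (the name report_rank.sh is an assumption), not part of the NHR@ZIB software stack. It relies on the OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_LOCAL_RANK variables that Open MPI sets for each process. Whether CUDA_VISIBLE_DEVICES is set depends on the Slurm configuration, so the script reports "unset" when it is missing.

```shell
#!/bin/bash
# report_rank.sh (hypothetical helper): print rank, host, CPU set and visible GPUs.

# Set by Open MPI for every launched process; default to 0 outside mpirun.
rank="${OMPI_COMM_WORLD_RANK:-0}"
local_rank="${OMPI_COMM_WORLD_LOCAL_RANK:-0}"

# GPUs visible to this process, if the variable is set in the job environment.
gpus="${CUDA_VISIBLE_DEVICES:-unset}"

# CPU cores this process may run on (Linux only; empty if /proc is unavailable).
cpus="$(awk '/Cpus_allowed_list/ {print $2}' /proc/self/status 2>/dev/null || true)"

report="rank ${rank} (local ${local_rank}) on $(hostname): cpus=${cpus:-unknown} CUDA_VISIBLE_DEVICES=${gpus}"
echo "$report"
```

Inside a job, the helper could be started in place of the application, for example with mpirun --report-bindings ./report_rank.sh, before launching the real run.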
