...

General information that applies to all Lise partitions can be found under the following topics:

...

Login authentication is possible via SSH keys only. For details, see the Quickstart Usage Guide.

Generic login name     List of login nodes
bgnlogin.nhr.zib.de    bgnlogin1.nhr.zib.de, bgnlogin2.nhr.zib.de
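
As a minimal sketch of a key-based login, the helper below only assembles and prints the ssh command for the host names listed above. The account name `myaccount`, the key path and the function name `lise_ssh_cmd` are hypothetical placeholders, not values from this page.

```shell
#!/bin/bash
# Sketch only: LISE_USER and LISE_KEY are hypothetical placeholders.
LISE_USER="${LISE_USER:-myaccount}"
LISE_KEY="${LISE_KEY:-$HOME/.ssh/id_ed25519}"

# Build the ssh command line for a login node (default: the generic login name).
lise_ssh_cmd() {
    local host="${1:-bgnlogin.nhr.zib.de}"
    printf 'ssh -i %s %s@%s\n' "$LISE_KEY" "$LISE_USER" "$host"
}

# Print the commands instead of running them, so they can be reviewed first.
lise_ssh_cmd
lise_ssh_cmd bgnlogin2.nhr.zib.de
```

Passing a host name as the first argument selects a specific login node; without an argument the generic name is used.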

...

  • Login and compute nodes of the A100 GPU partition run Rocky Linux (currently version 8.6).
  • Software provided by NHR@ZIB for the A100 GPU partition can be found using the module command, see the Quickstart Usage Guide.
  • Note the sw.a100 environment module: it controls the software selection for the A100 GPU partition.
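
One way to check whether the sw.a100 module is active in the current shell is sketched below. It assumes the module system maintains the colon-separated `LOADEDMODULES` variable, as Environment Modules and Lmod do; the function name `sw_a100_loaded` is hypothetical, and no particular version suffix of the module is assumed.

```shell
#!/bin/bash
# Return success if a loaded module name starts with "sw.a100".
# Assumption: the module system keeps LOADEDMODULES up to date.
sw_a100_loaded() {
    case ":${LOADEDMODULES:-}" in
        *:sw.a100*) return 0 ;;
        *)          return 1 ;;
    esac
}

if sw_a100_loaded; then
    echo "sw.a100 is loaded"
else
    echo "sw.a100 is not loaded"
fi
```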

...

GPU job script:
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --gres=gpu:4
# 2 nodes with 4 GPUs each (--gres counts per node), 8 MPI tasks in total

module load openmpi/gcc.11/4.1.4
mpirun ./mycode.bin
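
The job script above starts four MPI tasks per node, and each task can see all four GPUs of its node. If the application does not select a device itself, one common pattern is a small wrapper that assigns one GPU per node-local rank. The sketch below assumes Open MPI, which exports `OMPI_COMM_WORLD_LOCAL_RANK` to each process it launches; the file name `gpu_wrapper.sh`, the function `gpu_for_rank` and the variable `GPUS_PER_NODE` are hypothetical and not part of the NHR@ZIB setup.

```shell
#!/bin/bash
# gpu_wrapper.sh (hypothetical helper): pick one GPU per MPI rank, then run the program.
GPUS_PER_NODE="${GPUS_PER_NODE:-4}"   # matches --gres=gpu:4 in the job script

# Map a node-local rank to a GPU index, wrapping around if there are more ranks than GPUs.
gpu_for_rank() {
    local local_rank="$1" gpus_per_node="$2"
    echo $(( local_rank % gpus_per_node ))
}

# Only act when started by Open MPI with a program to run.
if [ -n "${OMPI_COMM_WORLD_LOCAL_RANK:-}" ] && [ "$#" -gt 0 ]; then
    CUDA_VISIBLE_DEVICES="$(gpu_for_rank "$OMPI_COMM_WORLD_LOCAL_RANK" "$GPUS_PER_NODE")"
    export CUDA_VISIBLE_DEVICES
    exec "$@"
fi
```

With such a wrapper, the last line of the job script would read `mpirun ./gpu_wrapper.sh ./mycode.bin`.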

Container

Apptainer is provided as a module and can be used to download, build, and run containers, e.g. NVIDIA containers:

Apptainer example:
bgnlogin1 ~ $ module load apptainer
Module for Apptainer 1.1.6 loaded.

# pull a TensorFlow image from nvcr.io; the image needs to be compatible with the local driver
bgnlogin1 ~ $ apptainer pull tensorflow-22.01-tf2-py3.sif docker://nvcr.io/nvidia/tensorflow:22.01-tf2-py3
...

# example: single-node run calling python from the container in an interactive job using 4 GPUs
bgnlogin1 ~ $ srun -pgpu-a100 --gres=gpu:4 --nodes=1 --pty --interactive --preserve-env ${SHELL}
...
bgn1003 ~ $ apptainer run --nv tensorflow-22.01-tf2-py3.sif python
...
Python 3.8.10 (default, Nov 26 2021, 20:14:08) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.config.list_physical_devices("GPU")
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')]

# optional: clean up the Apptainer cache
bgnlogin1 ~ $ apptainer cache list
...
bgnlogin1 ~ $ apptainer cache clean
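
The interactive session above can also be run as a batch job. The sketch below writes a small job script that performs the same GPU check non-interactively. It assumes the image `tensorflow-22.01-tf2-py3.sif` has already been pulled into the submission directory; the file name `tf_gpu_check.job` is hypothetical.

```shell
#!/bin/bash
# Write a small batch script; the file name tf_gpu_check.job is a placeholder.
cat > tf_gpu_check.job <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --nodes=1
#SBATCH --gres=gpu:4

module load apptainer
# --nv makes the host NVIDIA driver and GPUs available inside the container.
apptainer exec --nv tensorflow-22.01-tf2-py3.sif \
    python -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
EOF

echo "Submit with: sbatch tf_gpu_check.job"
```

Here `apptainer exec` is used instead of `apptainer run` so that the command to execute is stated explicitly rather than passed to the image's runscript.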