
The following Slurm partitions are available on the GPU A100 part of system Lise.

| Partition name | Nodes | GPUs per node | GPU hardware | Description |
|---|---|---|---|---|
| gpu-a100 | 36 | 4 | NVIDIA Tesla A100 80GB | full node exclusive |
| gpu-a100:shared | 5 | 4 | NVIDIA Tesla A100 80GB | shared node access, exclusive use of the requested GPUs |
| gpu-a100:shared:mig | 1 | 28 (4 x 7) | 1 to 28 1g.10gb A100 MIG slices | shared node access, shared GPU devices via Multi Instance GPU. Each of the four GPUs is logically split into seven usable slices with 10 GB of GPU memory associated to each slice |

Cost: 150 core hours per GPU or 21.43 core hours per MIG slice (one seventh of a GPU).
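
The rates above can be combined into a quick estimate for a planned job. This is a minimal sketch, assuming that the charge scales linearly with the number of devices and with the hours used (the page states only the rates); `gpu_cost` is a hypothetical helper name, not a tool installed on Lise.

```shell
# Hypothetical helper: estimate charged core hours for a GPU job.
# Rates come from this page; linear scaling with device count and
# hours is an assumption.
gpu_cost() {
    local kind=$1 count=$2 hours=$3
    local rate
    case "$kind" in
        gpu) rate=150 ;;     # core hours per full A100 GPU
        mig) rate=21.43 ;;   # core hours per 1g.10gb MIG slice
        *) echo "unknown device kind: $kind" >&2; return 1 ;;
    esac
    awk -v r="$rate" -v n="$count" -v h="$hours" \
        'BEGIN { printf "%.2f\n", r * n * h }'
}

gpu_cost gpu 4 1    # four GPUs (one full node) for 1 hour
gpu_cost gpu 2 12   # two GPUs for 12 hours
gpu_cost mig 1 24   # one MIG slice for 24 hours
```

Under these assumptions, four GPUs for one hour come to 600 core hours, which matches the per-node charge of the exclusive gpu-a100 partition.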


Lise (Berlin)

| Partition (number holds cores per node) | Node name | Max. walltime | Nodes | Max. nodes per job | Max. jobs (running / queued) per user | Usable memory per node (MB) | CPU | Shared | Charged core hours per node | Remark |
|---|---|---|---|---|---|---|---|---|---|---|
| gpu-a100 | bgn# | 24:00:00 | 36 | 36 | 16 / 500 | 1 000 000 | Ice Lake 8360Y | no | 600 | 4 A100 GPUs |
| gpu-a100:shared | bgn# | 24:00:00 | 5 | 5 | 16 / 500 | 1 000 000 | Ice Lake 8360Y | yes | 150 per GPU | 4 A100 GPUs |
| gpu-a100:shared:mig | bgn# | 24:00:00 | 1 | 1 | 16 / 500 | 1 000 000 | Ice Lake 8360Y | yes | 21.43 per MIG slice | 4 A100 GPUs with 7 1g.10gb MIG slices per GPU |
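
The limits above can be compared with the live scheduler configuration. This is a minimal sketch using the standard Slurm `sinfo` command and its documented format specifiers; the wrapper name `show_gpu_partitions` is illustrative, and the output on Lise may differ from the table if the configuration has changed.

```shell
# Illustrative wrapper: print partition name, node count, GPU resources,
# and time limit for the three GPU partitions.
# Format specifiers: %P partition, %D number of nodes,
# %G generic resources (GRES), %l maximum walltime.
show_gpu_partitions() {
    sinfo --partition="gpu-a100,gpu-a100:shared,gpu-a100:shared:mig" \
          --format="%P %D %G %l"
}

# Usage (on a login node): show_gpu_partitions
```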

Is the maximum walltime too short? Job dependencies can be used to chain several jobs and thereby run beyond the walltime limit of a single job.
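
Chaining jobs with dependencies can be sketched as follows. This is a minimal sketch, assuming a job script (here the hypothetical `job.slurm`) that resumes from a checkpoint written by the previous run. `--parsable` and `--dependency=afterok:<jobid>` are standard `sbatch` options; `submit_chain` is an illustrative helper name.

```shell
# Illustrative helper: submit the same job script N times so that each
# job starts only after the previous one has finished successfully.
# Assumes the script restarts from its own checkpoint.
submit_chain() {
    local script=$1 count=$2
    local jobid="" i=1
    while [ "$i" -le "$count" ]; do
        if [ -z "$jobid" ]; then
            # First link: no dependency.
            jobid=$(sbatch --parsable "$script") || return 1
        else
            # Later links wait until the previous job ends with exit code 0.
            jobid=$(sbatch --parsable --dependency=afterok:"$jobid" "$script") || return 1
        fi
        # --parsable may print "jobid;cluster"; keep only the job ID.
        jobid=${jobid%%;*}
        echo "submitted job $jobid"
        i=$((i + 1))
    done
}

# Usage (on a login node): submit_chain job.slurm 3
```

With `afterok`, a failed job leaves its successors pending with an unsatisfied dependency, so the chain does not continue from a broken state.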



Example: Exclusive usage of two nodes with 4 GPUs each
$ srun --nodes=2 --gres=gpu:4 --partition=gpu-a100 example_cmd


Example: Request two GPUs within the shared partition
# Note: The two GPUs may be located on different nodes.
$ srun --gpus=2 --partition=gpu-a100:shared example_cmd

# Note: Two GPUs on the same node.
$ srun --nodes=1 --gres=gpu:2 --partition=gpu-a100:shared example_cmd


Example: Request a single Multi Instance GPU slice on the corresponding Slurm partition
$ srun --gpus=1 --partition=gpu-a100:shared:mig example_cmd
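
For non-interactive work, the same requests can be placed in a batch script. The sketch below writes such a script for the case of two GPUs on one node of the shared partition. The file name `gpu_job.slurm` and the one-hour time limit are placeholders, `example_cmd` stands for the actual program as in the examples above, and the `#SBATCH` lines simply mirror the `srun` options.

```shell
# Write a minimal batch script (placeholder file name) that requests
# two GPUs on one node of the shared partition.
cat > gpu_job.slurm <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu-a100:shared
#SBATCH --nodes=1
#SBATCH --gres=gpu:2
#SBATCH --time=01:00:00

# Slurm typically sets CUDA_VISIBLE_DEVICES for the allocated GPUs.
echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
srun example_cmd
EOF

# Submit on a login node with: sbatch gpu_job.slurm
```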

Hardware configuration

NHR@ZIB offers access to compute nodes equipped with Nvidia A100 GPUs. The GPU A100 partition consists of two login nodes and 42 compute nodes with the following properties for a single node:

  • 2x Intel Xeon "Ice Lake" Platinum 8360Y (36 cores per socket, 2.4 GHz, 250 W)
  • 1 TB RAM (DDR4-3200)
  • 4x Nvidia A100 (80GB HBM2, SXM), two attached to each CPU socket
  • 7.68 TB NVMe local SSD
  • 200 GBit/s InfiniBand Adapter (Mellanox MT28908).

The hardware of the login nodes is similar to that of the A100 GPU compute nodes. Notable exceptions are the reduced memory (512 GB instead of 1 TB RAM) and the absence of GPUs (no CUDA drivers) on bgnlogin[1-2].
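
To see which node type a shell is running on, the properties listed above can be queried directly. This is a minimal sketch using the common Linux tools `nproc` and `free` and the NVIDIA tool `nvidia-smi`; the helper name `node_summary` is illustrative, and the guard reflects that `nvidia-smi` is not expected to be available on the login nodes.

```shell
# Illustrative helper: summarize the hardware visible to the current shell.
node_summary() {
    echo "host:   $(uname -n)"
    # Logical CPUs available to this process (may include hyperthreads).
    echo "cpus:   $(nproc)"
    # Total memory in GiB (second field of the "Mem:" line).
    echo "memory: $(free -g | awk '/^Mem:/ { print $2 }') GiB"
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One line per GPU or MIG device visible to this shell.
        nvidia-smi -L
    else
        echo "gpus:   none (nvidia-smi not found, e.g. on a login node)"
    fi
}

node_summary
```

Inside a job on gpu-a100:shared:mig, the device list should show MIG devices rather than full GPUs.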
