GPU PVC partition


This partition offers access to eight nodes, each equipped with four Intel Data Center GPU Max 1550 GPUs (formerly known as Ponte Vecchio, or PVC). The following sections give more details on their usage.

Hardware Overview

| Property | Login Node | Compute Nodes |
|---|---|---|
| Count | 2 | 8 |
| CPU | 2x Intel(R) Xeon(R) Platinum 8480L (Sapphire Rapids; 56 cores; 105 MB L3 cache) | same as login node |
| RAM | 512 GB | 1024 GB (1 TB) |
| Local Storage | 2x 1.8 TB SATA SSD | 1x 3.6 TB NVMe drive |
| GPUs | none | 4x Intel Data Center GPU Max 1550 (128 GB HBM; Xe links with all-to-all topology between GPUs); 2 per socket/NUMA domain |
| Fabric | InfiniBand HDR 200 GBit/s (1 HFI per node; on NUMA domain 0) | same as login node |
| Operating System | Rocky Linux 8 | same as login node |

Note that a single GPU Max 1550 consists of two "tiles" (also called "stacks"), which can be regarded as NUMA domains. Depending on your workload, you may restrict your application to a single tile or use a full GPU.
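One common way to restrict a process to a single tile is the `ZE_AFFINITY_MASK` environment variable of the Level Zero runtime. The sketch below is not taken from this page: it assumes the `COMPOSITE` device hierarchy (where a mask of the form `GPU.tile` selects one tile), four GPUs with two tiles each per node, and a hypothetical helper `tile_mask` that maps a node-local rank to such a mask. `SLURM_LOCALID` is set by Slurm for each task; outside of a job the sketch falls back to rank 0.

```shell
#!/bin/bash
# Hypothetical helper: map a node-local rank (0..7) to a "GPU.tile" pair,
# assuming 4 GPUs per node with 2 tiles each.
tile_mask() {
    local rank=$1
    local tiles_per_gpu=2
    echo "$(( rank / tiles_per_gpu )).$(( rank % tiles_per_gpu ))"
}

# Assumption: COMPOSITE hierarchy, so that "GPU.tile" addresses a single tile.
export ZE_FLAT_DEVICE_HIERARCHY=COMPOSITE

# Restrict this process to one tile, chosen by its node-local rank.
export ZE_AFFINITY_MASK="$(tile_mask "${SLURM_LOCALID:-0}")"
echo "ZE_AFFINITY_MASK=${ZE_AFFINITY_MASK}"
```

To use a full GPU instead, a mask consisting of the GPU index alone (for example `ZE_AFFINITY_MASK=2`) should expose both tiles of that GPU under the same assumptions.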

Login Nodes

Login to the GPU PVC partition is possible through dedicated login nodes, reachable via SSH under bgilogin.nhr.zib.de:

    $ ssh -i $HOME/.ssh/id_rsa_nhr nhr_username@bgilogin.nhr.zib.de
    Enter passphrase for key '...':
    bgilogin1 $

File systems

The HOME and WORK file systems on the GPU system are the same as on the CPU system; see Quickstart. Node-local SSD space on the compute nodes is accessible through the environment variable LOCAL_TMPDIR, which is defined during a Slurm session (batch or interactive job).
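As an illustration of how LOCAL_TMPDIR might be used inside a job, the following sketch creates a per-job working directory on the node-local storage and writes intermediate data there. The directory name, the file name, and the fallback to `mktemp -d` outside of a Slurm session are assumptions made for this example only.

```shell
#!/bin/bash
# LOCAL_TMPDIR is defined inside a Slurm session; fall back to a temporary
# directory so the sketch also runs elsewhere (the fallback is an assumption).
scratch="${LOCAL_TMPDIR:-$(mktemp -d)}"

# Hypothetical per-job working directory on the node-local storage.
workdir="${scratch}/run_${SLURM_JOB_ID:-test}"
mkdir -p "${workdir}"

# Write intermediate data locally instead of to HOME or WORK.
echo "intermediate data" > "${workdir}/scratch.dat"
ls -l "${workdir}"
```

Results that must survive the job would still need to be copied back to HOME or WORK before the job ends.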

Software and environment modules

Login and compute nodes of the PVC GPU partition are running under Rocky Linux.

Software for the PVC GPU partition provided by NHR@ZIB can be found using the module command, see Quickstart.

    bgilogin1 ~ $ module avail
    ...
    gcc/13.2.0 intel/2024.0.0
    ...
    bgilogin1 ~ $ module load intel
    Module for Intel oneAPI Compilers and Libraries (next generation compiler version 2024.0.0) loaded.
    bgilogin1 ~ $ module list
    Currently Loaded Modulefiles:
     1) HLRNenv   2) sw.pvc   3) slurm   4) intel/2024.0.0

Please note the sw.pvc environment module. While it is loaded, the module command shows the software installed for the PVC GPU partition. It is loaded by default on the PVC GPU login and compute nodes.

When compiling applications for the PVC GPU partition, we recommend using the PVC GPU login nodes. For very demanding compilations, or when the GPU drivers must be present, use a PVC GPU compute node via an interactive Slurm job session.
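An interactive session on a compute node could be requested roughly as follows. This is a sketch under assumptions: the one-hour time limit is a placeholder, and the `RUN` switch is a hypothetical guard added here so that the script only prints the command unless it is explicitly told to start the session.

```shell
#!/bin/bash
# Assumed resource request: one full gpu-pvc node for one hour.
interactive_cmd="srun --partition=gpu-pvc --nodes=1 --time=01:00:00 --pty bash"

# Print the command by default; set RUN=1 on a login node to start the session.
if [ "${RUN:-0}" = "1" ]; then
    $interactive_cmd
else
    echo "${interactive_cmd}"
fi
```

Once the shell on the compute node is open, the compiler modules can be loaded there in the same way as on the login nodes.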

Slurm Partitions

| Partition Name | Nodes | GPUs per Node | GPU Hardware | Description |
|---|---|---|---|---|
| gpu-pvc | 8 | 4 | Intel Data Center GPU Max 1550 | full node exclusive |

Example usage of two nodes (eight GPUs in total). Note that it is currently not required to request GPU resources via Slurm in order to use nodes/GPUs of the PVC partition.
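A two-node job of this kind might look like the sketch below, which writes a job script to a file. The application name `my_app`, the file name `pvc_job.sh`, the time limit, and the choice of four tasks per node (one per GPU) are assumptions for illustration, not settings prescribed by NHR@ZIB.

```shell
#!/bin/bash
# Write a hypothetical job script; my_app and the time limit are placeholders.
cat > pvc_job.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu-pvc
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

# No GPU request (e.g. --gres) is included, following the note above.
module load intel
srun ./my_app
EOF

echo "Submit with: sbatch pvc_job.sh"
```

Because the nodes are allocated exclusively, all four GPUs of each node are visible to the job's tasks.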

Charge rates

Charge rates for the Slurm partitions are listed in Accounting. The PVC partitions are currently available free of charge.