
Preface

CentOS 7 has reached its end of life. For this reason the operating system (OS) of Lise's CPU partition will be updated to Rocky Linux 9. This affects all login and compute nodes equipped with Intel Xeon Cascade Lake processors ("clx" for short). Lise's GPU partitions ("a100" and "pvc") are not affected.

...

SLURM partitions

CentOS 7           Rocky Linux 9
standard96         cpu-clx
standard96:test    cpu-clx:test
standard96:ssd     cpu-clx:ssd
large96            cpu-clx:large
large96:test       -
large96:shared     -
huge96             cpu-clx:huge

Legend: available / not available or closed
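
For users updating existing job scripts, the renaming above can be sketched as a small shell helper. This is only an illustration: the function name map_partition is hypothetical, and it treats large96:test and large96:shared as having no counterpart because none is listed for them.

```shell
#!/bin/bash
# Sketch: translate a CentOS 7 partition name into its Rocky Linux 9 name.
# map_partition is a hypothetical helper, not a tool provided on Lise.
map_partition() {
    case "$1" in
        standard96)      echo "cpu-clx" ;;
        standard96:test) echo "cpu-clx:test" ;;
        standard96:ssd)  echo "cpu-clx:ssd" ;;
        large96)         echo "cpu-clx:large" ;;
        huge96)          echo "cpu-clx:huge" ;;
        *)               return 1 ;;   # no counterpart listed for this name
    esac
}

# Example use (not executed here; job.sh is a placeholder):
#   sbatch --partition="$(map_partition standard96)" job.sh
```

The helper returns a non-zero exit status for names without a listed counterpart, so a calling script can stop instead of submitting to a partition that does not exist.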

What remains unchanged

  • node hardware and node names

  • communication network (Intel Omnipath)

  • file systems (HOME, WORK, PERM) and disk quotas

  • environment modules system (still based on Tcl, a.k.a. “Tmod”)

  • access credentials (user IDs, SSH keys) and project IDs

  • charge rates and CPU time accounting (early migrators' jobs are free of charge)

  • Lise’s Nvidia-A100 and Intel-PVC partitions

...

  • For users of SLURM’s srun job launcher:
    Open MPI 5.x has dropped support for the PMI-2 API and depends solely on PMIx to bootstrap MPI processes. For this reason the environment setting was changed from SLURM_MPI_TYPE=pmi2 to SLURM_MPI_TYPE=pmix, so binaries linked against Open MPI can be started “out of the box” as usual with srun mybinary. This also works for binaries linked against Intel-MPI, provided a recent version (≥2021.11) of Intel-MPI was used. If an older version of Intel-MPI was used and relinking/recompiling is not possible, follow the workaround for PMI-2 with srun described in the Q&A section below. Switching from srun to mpirun should also be considered.

  • Using more processes per node than available physical cores (PPN > 96; hyperthreads) with the OPX provider:
    The OPX provider currently does not support hyperthreads (PPN > 96) on the clx partitions. Using them may result in segmentation faults in libfabric during process startup. If a high PPN is really required, the default libfabric provider has to be changed to PSM2 by setting FI_PROVIDER=psm2. Note that using hyperthreads may not be advisable. We encourage users to test performance before using more threads than available physical cores.
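
The provider choice described in the last item can be sketched as a job-script fragment. This is a minimal sketch under stated assumptions: the function name select_fabric_provider and the variable PPN are hypothetical, and the value 96 is taken from the text above. When no hyperthreads are used, FI_PROVIDER is left untouched so that the system default applies.

```shell
#!/bin/bash
# Sketch of a job-script fragment; select_fabric_provider and PPN are
# hypothetical names chosen for illustration.
PHYSICAL_CORES=96   # physical cores per clx node, as stated above

select_fabric_provider() {
    local ppn="$1"
    if [ "$ppn" -gt "$PHYSICAL_CORES" ]; then
        # Hyperthreads in use: OPX is not supported, so switch to PSM2.
        export FI_PROVIDER=psm2
    fi
    # Otherwise leave FI_PROVIDER unchanged (system default provider).
}

# Example use inside a batch script (not executed here):
#   select_fabric_provider "$PPN"
#   srun ./mybinary    # relies on the default SLURM_MPI_TYPE=pmix
```

Keeping the override conditional avoids forcing PSM2 on jobs that stay within the physical core count.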

Action items for users

All users of Lise are recommended to

...