General information that applies to all Lise partitions can be found under the corresponding topics.
The GPU A100 partition offers access to two login nodes and 42 compute nodes equipped with Nvidia A100 GPUs. A single compute node has the following properties:

- 2x Intel Xeon "Ice Lake" Platinum 8360Y (36 cores per socket, 2.4 GHz, 250 W)
The hardware of the login nodes is similar to that of the compute nodes. Notable differences to the compute nodes are:
Login authentication is possible via SSH keys only. Please visit the Usage Guide.
| Generic login name | List of login nodes |
|---|---|
| bgnlogin.nhr.zib.de | bgnlogin1.nhr.zib.de, bgnlogin2.nhr.zib.de |
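As a sketch, a connection via the generic login name might look like the following; the user name `myaccount` and the key file path are placeholders for your own credentials:

```shell
# Log in to one of the A100 login nodes using SSH key authentication.
# "myaccount" and the key path are placeholders; substitute your own.
ssh -i ~/.ssh/id_ed25519 myaccount@bgnlogin.nhr.zib.de
```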
```shell
bgnlogin1 $ module avail
...
bgnlogin1 $ module load gcc
...
bgnlogin1 $ module list
Currently Loaded Modulefiles:
 1) HLRNenv   2) sw.a100   3) slurm   4) gcc/11.3.0(default)
```
To use MPI, load the openmpi/gcc.11/4.1.4 environment module. Open MPI respects the resource requests made to Slurm, so no special arguments to mpiexec/mpirun are required. Nevertheless, please check that your application is bound correctly to CPU cores and GPUs; use the --report-bindings option of mpiexec/mpirun to verify the binding.

A running job can be monitored interactively, directly on each of its compute nodes. Once you know the names of the job's nodes, you can log in and monitor the host CPUs as well as the GPUs.
```shell
bgnlogin1 $ squeue -u myaccount
  JOBID PARTITION     NAME      USER ST  TIME NODES NODELIST(REASON)
7748370  gpu-a100 a100_mpi myaccount  R  1:23     2 bgn[1007,1017]
bgnlogin1 $ ssh bgn1007
bgn1007 $ top
bgn1007 $ nvidia-smi
bgn1007 $ module load nvtop
bgn1007 $ nvtop
```
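The binding check for MPI applications can be sketched as follows. The salloc options and the binary name `mycode.bin` are illustrative only; adjust node, task, and GPU counts to your job:

```shell
# Load the Open MPI module on the login node.
bgnlogin1 $ module load openmpi/gcc.11/4.1.4
# Request an interactive allocation on the A100 partition (options are an example).
bgnlogin1 $ salloc --partition=gpu-a100 --nodes=1 --ntasks=4 --gres=gpu:4
# Print the CPU-core binding of each MPI rank before the real production run.
bgnlogin1 $ mpirun --report-bindings ./mycode.bin
```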
The GPU A100 partition shares the same Slurm batch system with all other partitions of System Lise.
```shell
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --gres=gpu:4

module load openmpi/gcc.11/4.1.4
mpirun ./mycode.bin
```
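Assuming the script above is saved as `a100_mpi.slurm` (the file name is illustrative), it can be submitted and monitored like this:

```shell
# Submit the batch script to Slurm.
sbatch a100_mpi.slurm
# List your own queued and running jobs; "myaccount" is a placeholder.
squeue -u myaccount
```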