Compilation on CPU CLX


Code execution

For examples of code execution, please visit Slurm partition CPU CLX.

Code compilation

Intel oneAPI compiler

module load intel
icx -o hello.bin hello.c
ifx -o hello.bin hello.f90
icpx -o hello.bin hello.cpp

module load intel
icx -fopenmp -o hello.bin hello.c
ifx -fopenmp -o hello.bin hello.f90
icpx -fopenmp -o hello.bin hello.cpp

GNU compiler

module load gcc
gcc -o hello.bin hello.c
gfortran -o hello.bin hello.f90
g++ -o hello.bin hello.cpp

module load gcc
gcc -fopenmp -o hello.bin hello.c
gfortran -fopenmp -o hello.bin hello.f90
g++ -fopenmp -o hello.bin hello.cpp

Slurm job script

The following examples of Slurm job scripts, e.g. myjobscript.slurm, cover the setup

  • 1 node,

  • 1 OpenMP code running.

#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test

./hello.bin
#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test

export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=96

./hello.bin
#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test

export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48

./hello.bin
#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test

export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=192

./hello.bin

You can also run several different OpenMP codes at the same time. The following examples cover the setup

  • 2 nodes,

  • 4 OpenMP codes run simultaneously.

  • The codes are not MPI parallel; mpirun is used only to start them.

#SBATCH --nodes=2
#SBATCH --partition=cpu-clx:test

module load impi/2019.5

export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48

mpirun -ppn 2 \
  -np 1 ./code1.bin : -np 1 ./code2.bin : \
  -np 1 ./code3.bin : -np 1 ./code4.bin
#SBATCH --nodes=2
#SBATCH --partition=cpu-clx:test

module load impi/2019.5

export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=96

mpirun -ppn 2 \
  -np 1 ./code1.bin : -np 1 ./code2.bin : \
  -np 1 ./code3.bin : -np 1 ./code4.bin

Compiler flags

To make full use of the vectorizing capabilities of the Intel Cascade Lake CPUs, AVX-512 instructions and the 512-bit ZMM registers can be used with the following compile flags of the Intel compilers:

-xCORE-AVX512 -qopt-zmm-usage=high

However, high ZMM register usage is not recommended in all cases (read more).

With the GNU compilers, the corresponding compiler flags are

-march=cascadelake -mprefer-vector-width=512

Using the Intel MKL

The Intel® Math Kernel Library (Intel® MKL) is designed to run on multiple processors and operating systems. It is also compatible with several compilers and third party libraries, and provides different interfaces to the functionality. To support these different environments, tools, and interfaces, Intel MKL provides multiple libraries from which to choose.

Check out Intel's link line advisor to see what libraries are recommended for a particular use case.
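For the common case of 64-bit Linux with the LP64 interface, typical advisor output resembles the link lines below. This is a sketch only: MKLROOT, library names, and layering should be confirmed with the link line advisor for your MKL version.

```shell
# Intel compilers: the -qmkl convenience flag links MKL automatically.
icx -qmkl=parallel   -o prog.bin prog.c   # OpenMP-threaded MKL
icx -qmkl=sequential -o prog.bin prog.c   # sequential MKL

# GNU compilers: link the MKL interface, threading, and core layers explicitly.
gcc -m64 -I"${MKLROOT}/include" -o prog.bin prog.c \
    -L"${MKLROOT}/lib/intel64" \
    -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core \
    -lgomp -lpthread -lm -ldl
```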