...
Codeblock (bash): Lise, for compute nodes with Rocky Linux 9 (using srun)

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=cpu-clx
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
# Our tests have shown that CP2K performs better with psm2 as the libfabric provider.
# Check whether this also applies to your system.
# To stick to the default provider, comment out the following line
export FI_PROVIDER=psm2
module load intel/2021.2 impi/2021.13 cp2k/2024.1
srun cp2k.psmp input > output
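The libfabric provider selected above can be cross-checked against what the node actually offers. A minimal check using the fi_info utility, which ships with libfabric; its availability on the system is an assumption here:

Codeblock (bash): Listing available libfabric providers (illustration)

# List the provider names libfabric can use on this node; psm2 should
# appear in the output if the FI_PROVIDER setting above is applicable.
fi_info -l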
Codeblock (bash): Lise, for compute nodes with CentOS 7 (using mpirun)

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
mpirun cp2k.psmp input > output
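Any of these scripts is submitted to Slurm in the usual way. A minimal sketch, assuming the script was saved as cp2k.job (the file name is chosen here purely for illustration):

Codeblock (bash): Submitting the job script (illustration)

sbatch cp2k.job    # submit the batch script (hypothetical file name)
squeue -u $USER    # check the state of your jobs in the queue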
...
Codeblock (bash): Lise, for compute nodes with CentOS 7 (using srun)

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2
srun cp2k.psmp input > output
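This variant leaves the pinning to Slurm's defaults. To inspect which CPUs each task actually receives, a short probe can be run in place of CP2K inside the same job script; this check is illustrative only and not part of the original recipe:

Codeblock (bash): Inspecting task-to-CPU binding (illustration)

# Print the CPU affinity of every task; each rank should report
# a distinct set of cores.
srun bash -c 'echo "task ${SLURM_PROCID}: $(grep Cpus_allowed_list /proc/self/status)"'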
Codeblock (bash): For Nvidia A100 GPU nodes

#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --job-name=cp2k
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_PLACES=cores
export OMP_PROC_BIND=close
module load gcc/11.3.0 openmpi/gcc.11/4.1.4 cuda/11.8 cp2k/2023.2
# gpu_bind.sh (see the following script) should be placed inside the same directory where cp2k will be executed
# Don't forget to make gpu_bind.sh executable by running: chmod +x gpu_bind.sh
mpirun --bind-to core --map-by numa:PE=${SLURM_CPUS_PER_TASK} ./gpu_bind.sh cp2k.psmp input > output
...
Codeblock (bash): gpu_bind.sh

#!/bin/bash
# Bind each MPI rank to its own GPU: Open MPI exports the node-local rank,
# which is used here to select the matching CUDA device.
export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
exec "$@"
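To see the effect of the wrapper, a trivial command can be run through it instead of cp2k.psmp. A quick check, for illustration only, to be run inside an allocation on a GPU node:

Codeblock (bash): Checking the rank-to-GPU mapping (illustration)

# Each of the 4 local ranks should report a different device id (0..3).
mpirun -np 4 ./gpu_bind.sh bash -c 'echo "local rank ${OMPI_COMM_WORLD_LOCAL_RANK}: CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"'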
Depending on the problem size, the code may stop with a segmentation fault caused by insufficient stack size or by threads exceeding their stack space. To circumvent this, we recommend inserting the following into the job script:
Codeblock (bash)

export OMP_STACKSIZE=512M
ulimit -s unlimited
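For context: OMP_STACKSIZE sets the stack size of each additional OpenMP thread, while ulimit -s lifts the stack limit of the process itself (and thus of the initial thread). A sketch of where the two lines fit in the job scripts above, with the placement shown for illustration:

Codeblock (bash): Placement in the job script (illustration)

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# Raise the stack limits before launching CP2K
export OMP_STACKSIZE=512M
ulimit -s unlimited
srun cp2k.psmp input > output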
...