CP2K

Description

CP2K is a package for atomistic simulations of solid state, liquid, molecular, and biological systems, offering a wide range of computational methods based on the mixed Gaussian and plane waves approach.

More information about CP2K and its documentation can be found at https://www.cp2k.org/

Availability

CP2K is freely available for all users under the GNU General Public License (GPL).

Modules

CP2K is an MPI-parallel application. Either mpirun or srun can be used as the job starter. If you opt for mpirun, then, in addition to loading the corresponding impi or openmpi module, CPU and/or GPU pinning must be carried out carefully (see the example job scripts below).

CP2K Version | Modulefile | Requirement | Compute Partitions | Support
7.1 | cp2k/7.1 | impi/2021.13 | CPU CLX | omp, libint, fftw3, libxc, elpa, parallel, mpi3, scalapack, xsmm, spglib, mkl
2022.2 | cp2k/2022.2 | intel/2021.2 (Lise), intel/2022.2 (Emmy) | CentOS 7 | libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori, libbqb
2023.1 | cp2k/2023.1 | intel/2021.2 (Lise), intel/2022.2 (Emmy) | CentOS 7 | Lise: libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori, libbqb. Emmy: libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius
2023.1 | cp2k/2023.1 | openmpi/gcc.11/4.1.4, cuda/11.8 | GPU A100 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc
2023.2 | cp2k/2023.2 | intel/2021.2, impi/2021.7.1 | CentOS 7 | libint, fftw3, libxc, elpa, scalapack, cosma, xsmm, spglib, mkl, sirius, libvori, libbqb
2023.2 | cp2k/2023.2 | openmpi/gcc.11/4.1.4, cuda/11.8 | GPU A100 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc
2024.1 | cp2k/2024.1 | impi/2021.13 | CPU CLX | omp, libint, fftw3, fftw3_mkl, libxc, elpa, parallel, mpi_f08, scalapack, xsmm, spglib, mkl, sirius, hdf5
2024.1 | cp2k/2024.1 | openmpi/gcc/5.0.3 | CPU Genoa | omp, fftw3, libxc, elpa, parallel, mpi_f08, scalapack, cosma, xsmm, spglib, sirius, hdf5
2025.1 | cp2k/2025.1 | impi/2021.14 | CPU CLX | omp, libint, fftw3, fftw3_mkl, libxc, elpa, parallel, scalapack, mpi_f08, cosma, xsmm, spglib, mkl, libdftd4, sirius, hdf5
2025.1 | cp2k/2025.1 | openmpi/gcc/5.0.3 | CPU Genoa | omp, libint, fftw3, libxc, elpa, parallel, mpi_f08, scalapack, cosma, xsmm, spglib, sirius, hdf5
2025.2 | cp2k/2025.1 | openmpi/gcc/5.0.3 | CPU Genoa | omp, libint, fftw3, libxc, elpa, parallel, scalapack, mpi_f08, xsmm, plumed2, spglib, libdftd4
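
To see which features from the Support column a given binary actually provides, you can load one of the module combinations from the table and query the binary itself: CP2K reports its compile-time flags in the version output. A minimal check, assuming the cp2k/2024.1 build for CPU CLX:

module load impi/2021.13 cp2k/2024.1
# The "cp2kflags:" line of the version output lists the features the binary was built with
cp2k.psmp --version | grep -i cp2kflags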

Remark: CP2K needs special attention when running on GPUs.

  1. Check whether a considerable acceleration can be expected for your problem. For example, a performance degradation has been reported for the following test cases: https://www.cp2k.org/performance:piz-daint-h2o-64, https://www.cp2k.org/performance:piz-daint-h2o-64-ri-mp2, https://www.cp2k.org/performance:piz-daint-lih-hfx, https://www.cp2k.org/performance:piz-daint-fayalite-fist

  2. GPU pinning is required (see the example job script below). Don't forget to make the script that takes care of the GPU pinning executable. In the example, this is done with:

chmod +x gpu_bind.sh
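
Before a production run, you may want to verify that each MPI rank really gets its own GPU. The following is a quick sketch, assuming an interactive job on a GPU node with the openmpi module loaded and gpu_bind.sh in the current directory; it should print four distinct values (0 to 3):

# Each local MPI rank runs "env" through gpu_bind.sh, so the printed
# CUDA_VISIBLE_DEVICES value shows which GPU that rank was assigned
mpirun -np 4 ./gpu_bind.sh env | grep CUDA_VISIBLE_DEVICES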

Using CP2K as a library

Starting from version 2023.2, CP2K has been compiled with the option enabled that allows it to be used as a library: libcp2k.a can be found in $CP2K_LIB_DIR, the header libcp2k.h is located in $CP2K_HEADER_DIR, and the module files (.mod), possibly needed by Fortran users, are in $CP2K_MOD_DIR.

For more details, please refer to the documentation.
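
As a sketch of how these environment variables can be used, the following shows a possible compile and link sequence against libcp2k for the cp2k/2023.2 CPU build listed above. The file name my_driver.f90 is a placeholder, and the link line is only indicative: depending on the build, further libraries (MKL, libxc, libint, ELPA, ...) have to be appended.

module load intel/2021.2 impi/2021.7.1 cp2k/2023.2

# Compile a Fortran driver that uses the CP2K interface modules (my_driver.f90 is a placeholder)
mpiifort -I"${CP2K_MOD_DIR}" -c my_driver.f90

# C/C++ code would instead include libcp2k.h, e.g.: mpiicc -I"${CP2K_HEADER_DIR}" -c my_driver.c

# Link against the static library libcp2k.a; additional libraries of the build
# must usually be added to the link line
mpiifort my_driver.o -L"${CP2K_LIB_DIR}" -lcp2k -o my_driver.x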

Example Jobscripts

For compute nodes on CPU CLX
#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=cpu-clx
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k

export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close

# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core

# Our tests have shown that CP2K has better performance with psm2 as libfabric provider.
# Check if this also applies to your system.
# To stick to the default provider, comment out the following line.
export FI_PROVIDER=psm2

# Select the appropriate Intel MPI module
module load impi/2021.14
# Select the appropriate version
module load cp2k/2025.1

mpirun cp2k.psmp input > output
For compute nodes on CPU Genoa
#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=cpu-genoa
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=48
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k

export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close

module load openmpi/gcc/5.0.3
# Select the appropriate version
module load cp2k/2024.1

# Do not use srun combined with export SLURM_CPU_BIND=none
# Important: here we are using mpirun to start the MPI processes. The pinning is performed according to the following line.
mpirun --bind-to core --map-by ppr:${SLURM_NTASKS_PER_NODE}:node:pe=${OMP_NUM_THREADS} cp2k.psmp input > output
For Nvidia A100 GPU nodes
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --job-name=cp2k

export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_PLACES=cores
export OMP_PROC_BIND=close

module load gcc/11.3.0 openmpi/gcc.11/4.1.4 cuda/11.8 cp2k/2023.2

# gpu_bind.sh (see the following script) should be placed inside the same directory where cp2k will be executed
# Don't forget to make gpu_bind.sh executable by running: chmod +x gpu_bind.sh
mpirun --bind-to core --map-by numa:PE=${SLURM_CPUS_PER_TASK} ./gpu_bind.sh cp2k.psmp input > output
gpu_bind.sh
#!/bin/bash
# Assign one GPU per local MPI rank, then execute the wrapped command
export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
"$@"

Remark on OpenMP

Depending on the problem size, the code may terminate with a segmentation fault due to an insufficient stack size or because threads exceed their allocated stack space. To circumvent this, we recommend inserting the following in the job script:

export OMP_STACKSIZE=512M
ulimit -s unlimited