GROMACS is a versatile package to perform molecular dynamics for systems with hundreds to millions of particles.

...

GROMACS provides extremely high performance compared to all other programs.
GROMACS can use the CPUs and GPUs available in a system simultaneously. There are options to balance the load between the different resources statically and dynamically.
GROMACS is user-friendly, with topologies and parameter files written in clear text format.
Both run input files and trajectories are independent of hardware endianness, and can thus be read by any version of GROMACS.
GROMACS comes with a large selection of flexible tools for trajectory analysis.
GROMACS can be run in parallel, using the standard MPI communication protocol.
GROMACS contains several state-of-the-art algorithms.
GROMACS is Free Software, available under the GNU Lesser General Public License (LGPL).

Weaknesses

GROMACS does not do too much further analysis, in order to achieve very high simulation speed.
Sometimes it is challenging to get non-standard information about the simulated system.
Different versions sometimes differ in their default parameters and methods. Reproducing simulations from an older version with a newer version can be difficult.
Additional tools and utilities provided by GROMACS are sometimes not of top quality.

...

GROMACS automatically uses any available GPUs. To achieve the best performance, GROMACS uses both GPUs and CPUs in a reasonable balance.

QuickStart

Environment modules

...

Submission script examples

Simple CPU job script

A simple GROMACS job using a total of 640 CPU cores for 12 hours. The requested number of cores in this example does not fill all available cores on the allocated nodes: the job will run 92 ranks on each of 3 nodes and 91 ranks on each of 4 nodes. Use this example if you know exactly how many ranks you want to use.

#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -n 640

export SLURM_CPU_BIND=none

module load impi/2019.5
module load gromacs/2019.6

mpirun gmx_mpi mdrun MDRUNARGUMENTS
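
The rank distribution stated for this job can be checked with a little shell arithmetic. The following is a minimal sketch, not part of the batch system: the function name distribute_ranks is hypothetical, and the value of 96 cores per node is an assumption taken from the whole-node example. It also assumes that ranks are spread as evenly as possible over the smallest number of nodes that can hold them.

```bash
#!/bin/bash
# Sketch: spread a total number of ranks as evenly as possible over nodes.
# distribute_ranks is a hypothetical helper, not a Slurm or GROMACS command.
distribute_ranks() {
    local total=$1 cores_per_node=$2
    # Smallest number of nodes that can hold all ranks (ceiling division).
    local nodes=$(( (total + cores_per_node - 1) / cores_per_node ))
    local base=$(( total / nodes ))    # ranks on the less loaded nodes
    local extra=$(( total % nodes ))   # number of nodes that get one more rank
    echo "${extra} nodes x $(( base + 1 )) ranks + $(( nodes - extra )) nodes x ${base} ranks"
}

# Assumption: 96 cores per node, as in the whole-node example.
distribute_ranks 640 96
```

For 640 ranks this prints "3 nodes x 92 ranks + 4 nodes x 91 ranks", which matches the layout described above.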

Whole node CPU job script

If you want to use all cores on the allocated nodes, the batch system offers other options to request the number of nodes and the number of tasks per node. The example below results in 672 ranks (7 nodes with 96 tasks each).

#!/bin/bash
#SBATCH -t 12:00:00
#SBATCH -p standard96
#SBATCH -N 7
#SBATCH --tasks-per-node 96

export SLURM_CPU_BIND=none

module load impi/2019.5
module load gromacs/2019.6

mpirun gmx_mpi mdrun MDRUNARGUMENTS
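
To confirm the number of ranks before starting a long run, the product of nodes and tasks per node can be printed at the top of the job script. This is a sketch under assumptions: the helper total_ranks is hypothetical, and it relies on Slurm setting SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE inside the job. Outside a job it falls back to the values of the example above.

```bash
#!/bin/bash
# Sketch: total ranks of a whole-node job = nodes x tasks per node.
# total_ranks is a hypothetical helper name.
total_ranks() {
    local nodes=$1 tasks_per_node=$2
    echo $(( nodes * tasks_per_node ))
}

# Inside a job Slurm is assumed to set these variables; the defaults
# (7 nodes, 96 tasks per node) are the values from the example script.
nodes=${SLURM_JOB_NUM_NODES:-7}
tasks_per_node=${SLURM_NTASKS_PER_NODE:-96}
echo "Running $(total_ranks "${nodes}" "${tasks_per_node}") ranks"
```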

GPU job script

The following script uses four thread-MPI ranks, one of which is dedicated to the long-range PME calculation. With the -gputasks 0001 option, the first three ranks offload their short-range non-bonded calculations to the GPU with ID 0, and the fourth (PME) rank offloads its calculations to the GPU with ID 1.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=gpu-a100
#SBATCH --ntasks=72

export SLURM_CPU_BIND=none

module load gcc/11.3.0 intel/2023.0.0 cuda/11.8
module load gromacs/2023.0

export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true

# Export the variable so that it is visible to gmx mdrun (matches -ntomp 9)
export OMP_NUM_THREADS=9

gmx mdrun -ntomp 9 -ntmpi 4 -nb gpu -pme gpu -npme 1 -gputasks 0001 OTHER MDRUNARGUMENTS

Note: The numbers of thread-MPI ranks and OpenMP threads are chosen to achieve optimal performance. The number of ranks should be a multiple of the number of sockets, and the number of cores per node should be a multiple of the number of threads per rank.
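
The two rules in the note can be checked with a small shell function before submitting a job. This is only a sketch: check_layout is a hypothetical helper, and the values of 2 sockets and 72 cores per node are assumptions (the latter chosen to match --ntasks=72 in the script above), so adjust them to the actual node type.

```bash
#!/bin/bash
# Sketch: check a rank/thread layout against the two rules of the note.
# check_layout is a hypothetical helper, not a GROMACS tool.
check_layout() {
    local ranks=$1 threads=$2 sockets=$3 cores=$4
    local status=0
    # Rule 1: the number of ranks should be a multiple of the number of sockets.
    if (( ranks % sockets != 0 )); then
        echo "ranks (${ranks}) is not a multiple of sockets (${sockets})"
        status=1
    fi
    # Rule 2: cores per node should be a multiple of the threads per rank.
    if (( cores % threads != 0 )); then
        echo "cores per node (${cores}) is not a multiple of threads per rank (${threads})"
        status=1
    fi
    if (( status == 0 )); then
        echo "layout OK"
    fi
    return "${status}"
}

# Values from the script above: 4 ranks, 9 threads per rank.
# Assumed node layout: 2 sockets, 72 cores per node.
check_layout 4 9 2 72
```

The function returns a non-zero exit status when either rule is violated, so it can also guard the mdrun call in a job script.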

Whole node GPU job script

To set up a whole-node GPU job, use the -gputasks option to map the ranks to all GPUs of the node.

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=gpu-a100
#SBATCH --ntasks=72

export SLURM_CPU_BIND=none

module load gcc/11.3.0 intel/2023.0.0 cuda/11.8
module load gromacs/2023.0

export GMX_GPU_DD_COMMS=true
export GMX_GPU_PME_PP_COMMS=true

# Export the variable so that it is visible to gmx mdrun (matches -ntomp 9)
export OMP_NUM_THREADS=9

gmx mdrun -ntomp 9 -ntmpi 16 -gputasks 0000111122223333 MDRUNARGUMENTS

Note: The numbers of thread-MPI ranks and OpenMP threads are chosen to achieve optimal performance. The number of ranks should be a multiple of the number of sockets, and the number of cores per node should be a multiple of the number of threads per rank.
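
The -gputasks string in the whole-node script assigns four consecutive ranks to each of the GPU IDs 0 to 3. For other rank counts such a string can be generated instead of typed by hand. The sketch below assumes an even, blockwise mapping with one GPU task per rank; make_gputasks is a hypothetical helper name, and it is limited to ten GPUs because each character in the string stands for one single-digit GPU ID.

```bash
#!/bin/bash
# Sketch: build a blockwise -gputasks string for a given number of ranks and GPUs.
# make_gputasks is a hypothetical helper, not part of GROMACS.
make_gputasks() {
    local ranks=$1 gpus=$2
    local per_gpu id i out=""
    # Each character is one GPU ID, so only IDs 0-9 can be expressed here.
    if (( gpus < 1 || gpus > 10 || ranks % gpus != 0 )); then
        echo "ranks must be a multiple of gpus, with 1 to 10 GPUs" >&2
        return 1
    fi
    per_gpu=$(( ranks / gpus ))
    for (( id = 0; id < gpus; id++ )); do
        for (( i = 0; i < per_gpu; i++ )); do
            out+="${id}"
        done
    done
    echo "${out}"
}

# 16 ranks on 4 GPUs, as in the script above.
make_gputasks 16 4
```

For 16 ranks and 4 GPUs this prints 0000111122223333, the string used in the script. The result could be passed on as -gputasks "$(make_gputasks 16 4)".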

Related Modules

Gromacs-Plumed

PLUMED is an open-source, community-developed library that provides a wide range of different methods, such as enhanced-sampling algorithms, free-energy methods and tools to analyze the vast amounts of data produced by molecular dynamics (MD) simulations. PLUMED works together with some of the most popular MD engines.

The Gromacs/20XX.X-plumed modules are GROMACS versions that have been patched with PLUMED's modifications. These versions are able to run metadynamics simulations.

Analyzing results

GROMACS Tools

...

More information about the performance of simulations and how to improve it can be found here.
