Floating point exception with Intel MPI 2019.x using one task per node

Problem

When Intel MPI 2019 (impi/2019.x) is used to start MPI jobs that run one task per node, the job aborts immediately with an error message:

srun: error: gcnXXXX: task 0: Floating point exception
srun: error: gcnYYYY: task 1: Floating point exception
[...]

 

Solution

This is caused by a problem in Intel's Hydra process manager. As a workaround, set the environment variable I_MPI_HYDRA_TOPOLIB to ipl in the job script (the internal default value is hwloc).

 

#!/bin/bash
#SBATCH [...]
export I_MPI_HYDRA_TOPOLIB=ipl
[...]
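If the same job script is shared between several Intel MPI versions, the workaround could be applied conditionally. The following is only a sketch: the helper name set_topolib_workaround is hypothetical, the version string "2019.5" is an illustrative value, and it assumes that only the 2019 release series needs the setting. How the version string is obtained is site-specific and is left to the caller.

```shell
#!/bin/bash
# Hypothetical helper (not part of Intel MPI or Slurm): export the
# workaround only when the given Intel MPI version is a 2019 release.
# Assumption: only the 2019.x series is affected.
set_topolib_workaround() {
    local impi_version="$1"
    case "$impi_version" in
        2019*) export I_MPI_HYDRA_TOPOLIB=ipl ;;  # use ipl instead of the default hwloc
        *)     ;;                                  # other versions: leave the setting untouched
    esac
}

# Illustrative call; "2019.5" is an example value, not taken from the article.
set_topolib_workaround "2019.5"
echo "I_MPI_HYDRA_TOPOLIB=${I_MPI_HYDRA_TOPOLIB:-<unset>}"
```

In a real job script the call would go after the #SBATCH lines and before the MPI program is started, in the same place as the plain export shown above.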

 
