[gmx-users] Multi-node GPU runs crashing with a fork() warning
Thomas C. O'Connor
toconnor at jhu.edu
Thu May 22 00:14:02 CEST 2014
Hey Folks,
I'm attempting to run simulations on a multi-node GPU cluster, and my
simulations are crashing after flagging an Open MPI fork() warning:
*------------------------------------------------------------------------------------------*
*An MPI process has executed an operation involving a call to the*
*"fork()" system call to create a child process. Open MPI is currently*
*operating in a condition that could result in memory corruption or*
*other system errors; your MPI job may hang, crash, or produce silent*
*data corruption. The use of fork() (or system() or other calls that*
*create child processes) is strongly discouraged.*
*The process that invoked fork was:*
* Local host: lngpu019 (PID 11549)*
* MPI_COMM_WORLD rank: 18*
*If you are *absolutely sure* that your application will successfully*
*and correctly survive a call to fork(), you may disable this warning*
*by setting the mpi_warn_on_fork MCA parameter to 0.*
*------------------------------------------------------------------------------------------*
I saw a similar mailing-list post about this sort of issue from September
2013, but the thread had no resolution.
- Each node of our cluster has 12 Intel cores and 6 NVIDIA Tesla
C2050 GPUs.
- We call: mpirun -machinefile nodes.txt -npernode 6 mdrun_mpi
- I compiled GROMACS on one of the compute nodes with the C2050s.
We also have a few nodes with newer NVIDIA K20 GPUs. When we compile
GROMACS on these nodes, we can run the code across multiple nodes and GPUs
without any errors.
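For concreteness, here is a minimal sketch of the launch line above with the warning suppressed through the mpi_warn_on_fork MCA parameter named in the message. The variable names are placeholders, and the script only prints the command rather than running it. Silencing the warning hides the message; it is not expected to fix the crash itself.

```shell
#!/bin/sh
# Hypothetical wrapper around the launch line quoted in this post.
# "--mca mpi_warn_on_fork 0" only silences the Open MPI warning; it does
# not change whatever is calling fork().
NODEFILE=nodes.txt      # machinefile used in the post
RANKS_PER_NODE=6        # assumption: one MPI rank per GPU, 6 GPUs per node

CMD="mpirun --mca mpi_warn_on_fork 0 -machinefile $NODEFILE -npernode $RANKS_PER_NODE mdrun_mpi"

# Dry run: print the command so it can be inspected before launching.
echo "$CMD"
```

The same parameter can also be set through the environment (OMPI_MCA_mpi_warn_on_fork=0) instead of on the mpirun command line.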
I don't know whether the fork() warning is directly related to the crash,
or whether there might be obscure, device-specific object files outside my
build directory that I should delete. Any insight you folks could provide to
help me solve this issue would be appreciated.
Thanks,
--
Thomas O'Connor
Graduate Research Assistant
MCS IGERT Fellow
Department of Physics & Astronomy
The Johns Hopkins University
3701 San Martin Drive
Baltimore, MD 21218
toconnor at jhu.edu
410.516.8587