[gmx-users] error in running systems in parallel

Justin Lemkul jalemkul at vt.edu
Thu May 1 11:29:35 CEST 2014



On 5/1/14, 2:23 AM, Balasubramani GL wrote:
>
>
> Hi Friends,
>
>   I have individual input files and directories for my system, and I want to
> perform REMD (replica exchange MD). When I execute a multi-simulation run
> with mdrun_mpi, I get an error like this:
>
> I have individual directories called equil0, equil1, equil2, and equil3,
> which contain the input/output files, under a main directory called stage1.
> When I run this command:
>
> $ mpirun -np 160 mdrun_mpi -v -multidir equil[0123]
>
> this error pops up:
>
> NNODES=160, MYRANK=159, HOSTNAME=structure
> NODEID=159 argc=7
>
> Back Off! I just backed up md.log to ./#md.log.2#
> Getting Loaded...
> Reading file topol.tpr, VERSION 4.6.1 (single precision)
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 4.5.5
> Source code file: tpxio.c, line: 2010
>
> Fatal error:
> reading tpx file (topol.tpr) version 83 with version 73 program
> For more information and tips for troubleshooting, please check the
> GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
> "Alas, You're Welcome" (Prof. Dumbledore in Potter Puppet Pals)
>
> Error on node 0, will try to stop all the nodes
> Halting parallel program mdrun_mpi on CPU 0 out of 160
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode -1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
>
> gcq#322: "Alas, You're Welcome" (Prof. Dumbledore in Potter Puppet Pals)
>
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 127928 on
> node structure exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
> Can somebody suggest how to fix this error? Your help is greatly
> appreciated.
>

The .tpr files were created with a newer version of Gromacs (4.6.1, which writes 
tpx format 83) than the mdrun_mpi installed on the cluster (4.5.5, tpx format 73), 
and an older mdrun cannot read a newer .tpr.  Use a consistent version: either 
regenerate the .tpr files with the cluster's 4.5.5 grompp, or run with a 4.6.1 
mdrun_mpi.
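
As a minimal sketch of the first option (the file names equil.mdp, conf.gro, and 
topol.top are assumptions; substitute the actual inputs used for each replica, 
and check that your .mdp settings are accepted by 4.5.5 grompp):

# Rebuild each replica's topol.tpr with the cluster's own grompp (4.5.5)
for d in equil0 equil1 equil2 equil3; do
    (cd "$d" && grompp -f equil.mdp -c conf.gro -p topol.top -o topol.tpr)
done

# Then relaunch the multi-simulation run as before
mpirun -np 160 mdrun_mpi -v -multidir equil[0123]

Alternatively, if the cluster has (or can get) a 4.6.1 installation, run with 
that mdrun_mpi and leave the existing .tpr files as they are.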

-Justin

-- 
==================================================

Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow

Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 601
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201

jalemkul at outerbanks.umaryland.edu | (410) 706-7441
http://mackerell.umaryland.edu/~jalemkul

==================================================

