[gmx-users] problem extending simulation 64 proc

Berk Hess gmx3 at hotmail.com
Thu Mar 5 16:44:01 CET 2009


I guess you are not aware that you are using 128 cores while asking for only 32 cores to do PME.
Using -npme 64 will probably result in much higher performance, but you should check
the information printed at the end of the log file. The real PME load ratio (not the guess
that grompp and mdrun print) can be found there. If your PME load is really around 0.5,
you should decrease it by increasing the cut-off and the grid spacing by the same factor.
This will provide higher performance.
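
As a rough illustration of scaling the cut-off and the grid spacing together, here is a minimal sketch. The starting values and the scaling factor are assumptions chosen for the example, not values from this thread; rcoulomb and fourierspacing are the .mdp parameters I assume are meant, and other cut-off settings in your .mdp file may need to be adjusted consistently.

```shell
#!/bin/sh
# Hypothetical starting values; substitute those from your own .mdp file.
rcoulomb=1.0          # nm, assumed Coulomb cut-off
fourierspacing=0.12   # nm, assumed PME grid spacing
scale=1.2             # assumed common scaling factor

scale_value() {
    # Multiply $1 by $2 and print the result with three decimals.
    LC_ALL=C awk -v v="$1" -v s="$2" 'BEGIN { printf "%.3f", v * s }'
}

new_rcoulomb=$(scale_value "$rcoulomb" "$scale")
new_spacing=$(scale_value "$fourierspacing" "$scale")

# Lines in .mdp syntax, to be copied into the parameter file by hand.
echo "rcoulomb       = $new_rcoulomb"
echo "fourierspacing = $new_spacing"
```

With the assumed inputs this prints rcoulomb = 1.200 and fourierspacing = 0.144. After rerunning grompp and mdrun with the scaled values, the PME load reported at the end of the log file shows whether the balance actually improved.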

I don't know what happens with tpbconv. The crash is surely not caused by something that happened
in your simulation.
You could try to use the -time option of tpbconv to select one of the last frames.

But in Gromacs 4 you should no longer use tpbconv.
You can use checkpoint files.
Simply make a tpr file with the complete runtime (you can use tpbconv -until or -nsteps without -f and -e),
and run mdrun with the options -cpi to read a checkpoint and -maxh to finish after a certain number of hours.
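
A minimal sketch of that checkpoint workflow, under stated assumptions: the output tpr name, the end time, and the hour limit are hypothetical, and state.cpt is assumed to be the checkpoint left by a previous mdrun job. The script only prints the commands unless DRY_RUN=0 is set, so it can be inspected before anything is executed.

```shell
#!/bin/sh
# Sketch of a checkpoint-based continuation for GROMACS 4.x.
# File names, end time, and wall-clock limit below are assumptions.
DRY_RUN=${DRY_RUN:-1}      # 1 = only print the commands, 0 = execute them
TPR_IN=equilibrado9.tpr    # existing run input (name taken from the thread)
TPR_FULL=full_run.tpr      # hypothetical name for the full-length run input
END_PS=10000               # hypothetical total simulation time in ps
MAX_HOURS=24               # hypothetical wall-clock limit per job

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# 1. Write a tpr covering the complete runtime; no -f or -e is needed.
run tpbconv -s "$TPR_IN" -until "$END_PS" -o "$TPR_FULL"

# 2. Continue from the checkpoint of a previous job and stop cleanly
#    after MAX_HOURS hours.
run mdrun -v -s "$TPR_FULL" -cpi state.cpt -maxh "$MAX_HOURS"
```

On a cluster the mdrun line would be wrapped in the usual launcher (srun in your case), with -npme added as discussed above.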


From: regafan at hotmail.com
To: gmx-users at gromacs.org
Date: Thu, 5 Mar 2009 15:21:08 +0000
Subject: [gmx-users] problem extending simulation 64 proc

I have a problem in extending a MD simulation in Gromacs.
When I use 32 processors for the calculation, everything goes OK. The simulation finishes well and I can extend it with tpbconv.
However, I would like to increase the number of processors used up to 64. 
Using the same options for mdrun as I have used for 32 proc except for the number of processors:
srun  -n 64 /gpfs/apps/GROMACS/4.0.2/bin/mdrun -v -deffnm equilibrado9 -dlb auto
I get this error:
Program mdrun, VERSION 4.0.2
Source code file: domdec_setup.c, line: 132
Fatal error:
Could not find an appropriate number of separate PME nodes. i.e. >= 0.474960*#nodes (58) and <= #nodes/2 (64) and reasonable performance wise (grid_x=384, grid_y=162).
Use the -npme option of mdrun or change the number of processors or the PME grid dimensions, see the manual for details.
When I changed to 
srun  -n 64 /gpfs/apps/GROMACS/4.0.2/bin/mdrun -v -deffnm equilibrado9 -dlb auto -npme 32
the calculation finished correctly. However, when I try to extend this simulation with tpbconv:
/gpfs/apps/GROMACS/4.0.4/bin/tpbconv -s equilibrado9.tpr -f equilibrado9.trr -e equilibrado9.edr -n index.ndx -o equilibrado10.tpr -extend 1600
The process dies and I don't know why:
Opened equilibrado9.edr as single precision energy file
trn version: GMX_trn_file (single precision)
Read    trr frame    452: step 2652000 time 5304.000Killed
When I used gmxcheck I did not find anything strange. I have also tried doing the tpbconv step with version 4.0.4 of Gromacs, but the error is the same. This has occurred in several calculations, always using 64 processors and -npme 32, so it is not an isolated error; something must be happening in the calculation, but I don't know what.
Does anybody have any idea?
Thank you very much for your help in advance,
Rebeca Garcia
Parc Cientific of Barcelona
regafan at hotmail.com