[gmx-users] problem extending simulation 64 proc

Berk Hess gmx3 at hotmail.com
Thu Mar 5 16:44:01 CET 2009


Hi,

I guess you are not aware that you are using 128 cores, of which you ask for only 32 to do PME.
Using -npme 64 will probably result in much higher performance, but you should check
the information printed at the end of the log file. The real PME load ratio (not the guess
that grompp and mdrun print) can be found there. If your PME load really is around 0.5,
you should decrease it by increasing the cut-off and the grid spacing by the same factor.
This will provide higher performance.
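
To make the arithmetic above concrete, here is a minimal sketch. The rank count and the estimated PME load ratio are taken from the error message quoted below ("#nodes/2 (64)" and 0.474960). The helper names and the starting cut-off and grid spacing are hypothetical and only illustrate the idea. Scaling "by the same amount" is read here as multiplying both values by the same factor, which is an interpretation rather than something stated in the thread.

```python
# Numbers from the thread: the error message reports "#nodes/2 (64)",
# so mdrun saw 128 ranks, and it estimated a PME load ratio of 0.474960.
total_ranks = 2 * 64
estimated_pme_ratio = 0.474960


def pme_fraction(npme, nranks):
    """Fraction of all ranks that are dedicated to PME."""
    return npme / nranks


for npme in (32, 64):
    frac = pme_fraction(npme, total_ranks)
    print(f"-npme {npme}: {frac:.2f} of the ranks do PME "
          f"(estimated PME load {estimated_pme_ratio:.2f})")


def scale_pme_settings(rcoulomb, fourierspacing, factor):
    """Scale the Coulomb cut-off and the PME grid spacing by one factor."""
    return rcoulomb * factor, fourierspacing * factor


# Hypothetical starting values in nm (not from the thread).
new_rc, new_fs = scale_pme_settings(rcoulomb=1.0, fourierspacing=0.12, factor=1.1)
print(f"rcoulomb = {new_rc:.3f} nm, fourierspacing = {new_fs:.3f} nm")
```

With -npme 32 only a quarter of the 128 ranks work on PME, well below the estimated load of about 0.47, while -npme 64 gives one half. Whether that is the best split still has to be checked against the load reported at the end of the log file.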

I don't know what happens with tpbconv. The crash is surely not caused by something that happened
in your simulation.
You could try to use the -time option of tpbconv to select one of the last frames.

But in Gromacs 4 you should no longer use tpbconv.
You can use checkpoint files.
Simply make a tpr file with the complete runtime (you can use tpbconv -until or -nsteps without -f and -e),
and run mdrun with the options -cpi to read a checkpoint and -maxh to finish after a certain number of hours.
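
The two steps above can be sketched as follows. This only assembles and prints the command lines; it does not run GROMACS. The file names, the end time and the hour limit are hypothetical placeholders, and the helper functions are made up for this illustration. The options used (-s, -o, -until, -cpi, -maxh) are the ones named in this thread plus the standard run-input option.

```python
import shlex


def extend_tpr_cmd(tpr_in, tpr_out, until_ps):
    """tpbconv call that sets the end time of the run; no -f or -e is passed."""
    return ["tpbconv", "-s", tpr_in, "-o", tpr_out, "-until", str(until_ps)]


def continue_run_cmd(tpr, checkpoint, max_hours):
    """mdrun call that reads a checkpoint and stops after max_hours hours."""
    return ["mdrun", "-s", tpr, "-cpi", checkpoint, "-maxh", str(max_hours)]


# Hypothetical file names and limits, for illustration only.
commands = [
    extend_tpr_cmd("run.tpr", "run_long.tpr", 10000),
    continue_run_cmd("run_long.tpr", "state.cpt", 24),
]

for cmd in commands:
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The same mdrun line can be resubmitted for every job in a queue: each job picks up from the last checkpoint and stops before the time limit.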

Berk

From: regafan at hotmail.com
To: gmx-users at gromacs.org
Date: Thu, 5 Mar 2009 15:21:08 +0000
Subject: [gmx-users] problem extending simulation 64 proc

Hello,
I have a problem extending an MD simulation in Gromacs.
When I use 32 processors for the calculation, everything goes OK. The simulation finishes normally and I can extend it with tpbconv.
However, I would like to increase the number of processors to 64.
 
Using the same mdrun options as for 32 processors, except for the number of processors:
 
srun  -n 64 /gpfs/apps/GROMACS/4.0.2/bin/mdrun -v -deffnm equilibrado9 -dlb auto
 
I get this error:
 
-------------------------------------------------------
Program mdrun, VERSION 4.0.2
Source code file: domdec_setup.c, line: 132
 
Fatal error:
Could not find an appropriate number of separate PME nodes. i.e. >= 0.474960*#nodes (58) and <= #nodes/2 (64) and reasonable performance wise (grid_x=384, grid_y=162).
Use the -npme option of mdrun or change the number of processors or the PME grid dimensions, see the manual for details.
 
 
When I changed to 
 
srun  -n 64 /gpfs/apps/GROMACS/4.0.2/bin/mdrun -v -deffnm equilibrado9 -dlb auto -npme 32
 
the calculation finished correctly. However, when I try to extend this simulation with tpbconv:
 
/gpfs/apps/GROMACS/4.0.4/bin/tpbconv -s equilibrado9.tpr -f equilibrado9.trr -e equilibrado9.edr -n index.ndx -o equilibrado10.tpr -extend 1600
 
The process dies and I don't know why:
 
READING COORDS, VELS AND BOX FROM TRAJECTORY equilibrado9.trr...
 
Opened equilibrado9.edr as single precision energy file
trn version: GMX_trn_file (single precision)
Read    trr frame    452: step 2652000 time 5304.000Killed
 
When I ran gmxcheck I did not find anything strange. I have also tried tpbconv from version 4.0.4 of Gromacs, but the error is the same. This has occurred in several calculations, always using 64 processors and -npme 32, so it is not an isolated error. Something must be happening in the calculation, but I don't know what.
Does anybody have any idea?
 
Thank you very much for your help in advance,
 
Rebeca Garcia
Parc Cientific of Barcelona
regafan at hotmail.com

