[gmx-developers] Problem with simulation in 8 nodes

Berk Hess hessb at mpip-mainz.mpg.de
Mon Aug 4 14:51:46 CEST 2008


Hi,

This looks like a bug.
Can you mail me the tpr file?

Berk.

Jose Duarte wrote:
> I'm running a simulation with the latest version of gromacs from CVS. 
> My protein is 90 residues long, I add waters and ions as usual and 
> then perform energy minimization, position restrained equilibration 
> and a molecular dynamics run. This all works perfectly fine on 1 cpu 
> (standard mdrun executable) and on 4 and 6 cpus (using mdrun_mpi), but 
> mysteriously fails on 8 cpus. I've tried this on several setups: using 
> lam-mpi in linux on a single multi-core box, using lam-mpi on several 
> nodes of a cluster, using open-mpi on a multi-core Mac. I'm always 
> getting exactly the same behaviour: all works fine on 4 or 6 cpus but 
> fails on 8. Gromacs is compiled with default parameters (single 
> precision).
>
> The problem itself comes in the energy minimization step. I run a 
> pretty standard EM with PME for electrostatic interactions. This is 
> the error message when running mdrun:
>
> ##########
> Making 2D domain decomposition 4 x 2 x 1
> Steepest Descents:
>   Tolerance (Fmax)   =  1.00000e+01
>   Number of steps    =         5000
>
> A list of missing interactions:
>            G96Angle of   1304 missing     -1
>         Proper Dih. of    510 missing     -2
>       Improper Dih. of    409 missing     -1
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 3.3.99_development_20080718
> Source code file: domdec_top.c, line: 88
>
> Software inconsistency error:
> Some interactions seem to be assigned multiple times
>
> -------------------------------------------------------
>
> Error on node 3, will try to stop all the nodes
> Halting parallel program mdrun_mpi on CPU 3 out of 8
> ##########
>
>
>
> The one difference I notice in this case, compared to running on 4 
> or 6 cpus, is that in those cases the domain decomposition is 1D 
> instead of 2D. I have no idea if that's relevant.
>
> Actually, looking at the log file produced by mdrun, the simulation 
> seems to run properly until step 403, after which this error is reported:
>
>
> ##########
> Not all bonded interactions have been properly assigned to the domain 
> decomposition cells
>
> A list of missing interactions:
>            G96Angle of   1304 missing     -1
>         Proper Dih. of    510 missing     -2
>       Improper Dih. of    409 missing     -1
> ##########
>
>
> I have also tried to run the same procedure on another protein, and 
> there the problem doesn't arise at all, so it seems to be specific to 
> this particular protein. I can send the pdb file if that's helpful.
>
> Any ideas? Is this a bug?
>
> Thanks
>
> Jose
>
>
>
> Jose M. Duarte
> Max Planck Institute for Molecular Genetics
> Ihnestr. 63-73
> 14195 Berlin
> Germany
>
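
When comparing logs from the 4, 6 and 8 cpu runs, the "missing interactions" table quoted above can be tabulated with a short script. The following is only a minimal sketch: the function name, regular expression and sample text are illustrative and not part of GROMACS. It assumes the line layout shown in the quoted output, and it assumes that a negative "missing" count means an interaction was assigned more than once, which is what the accompanying error message suggests.

```python
import re

# Matches lines such as "   G96Angle of   1304 missing     -1",
# with or without a leading ">" quote marker (layout assumed from the log above).
LINE_RE = re.compile(
    r"^\s*(?:>\s*)?(?P<name>.+?)\s+of\s+(?P<total>\d+)\s+missing\s+(?P<missing>-?\d+)\s*$"
)


def parse_missing_interactions(text):
    """Return (interaction name, total count, missing count) for each table row."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((m["name"], int(m["total"]), int(m["missing"])))
    return rows


# Sample copied from the error output quoted in this thread.
SAMPLE = """\
A list of missing interactions:
           G96Angle of   1304 missing     -1
        Proper Dih. of    510 missing     -2
      Improper Dih. of    409 missing     -1
"""

if __name__ == "__main__":
    for name, total, missing in parse_missing_interactions(SAMPLE):
        # Assumption: a negative count is read as "assigned more than once".
        status = "assigned more than once" if missing < 0 else "missing"
        print(f"{name}: {abs(missing)} of {total} {status}")
```

Running the same function over the md.log of each run makes it easy to see whether the affected interaction types differ between the 1D and 2D decompositions.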



