[gmx-developers] Error in parallel mdrun: "More than 8 graph edges per atom"
David van der Spoel
spoel at xray.bmc.uu.se
Sun Oct 26 09:29:10 CET 2008
Igor Leontyev wrote:
> Hi, gromacs experts.
> I have a problem starting parallel simulations on 5 or more CPUs:
Maybe not what you want to hear, but this version is not really
supported anymore. This *might* have been fixed in 3.3.3, but your best
bet for parallelization is version 4.0.
> -------------------------------------------------------
> Program mdrun, VERSION 3.3.1
> Source code file: mshift.c, line: 91
> Fatal error:
> More than 8 graph edges per atom (atom 950)
> -------------------------------------------------------
> The solution reported earlier, "to increase the number 4 on the line 229
> of mshift.c", requires recompiling, which is rather complicated for me.
> Moreover, that solution seems irrelevant to my case, since the
> problem disappears when the simulation is started on 4 CPUs. Does anyone
> have any idea how to avoid the problem in a 16-CPU mdrun without recompilation?
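
For readers unfamiliar with the message, here is a minimal sketch of the kind of check it describes, assuming that a "graph edge" is a bonded connection between two atoms. This is not the code in mshift.c; the function names, the limit of 8 and the toy bond list are all hypothetical and only illustrate counting edges per atom against a fixed limit.

```python
from collections import defaultdict


def edges_per_atom(bonds):
    """Count the distinct bonded partners of each atom in an undirected bond list."""
    neighbours = defaultdict(set)
    for i, j in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)
    return {atom: len(partners) for atom, partners in neighbours.items()}


def atoms_over_limit(bonds, limit=8):
    """Return the atoms (sorted by index) that have more than `limit` edges."""
    counts = edges_per_atom(bonds)
    return sorted(atom for atom, n in counts.items() if n > limit)


# Hypothetical toy topology: atom 0 is bonded to nine other atoms.
toy_bonds = [(0, k) for k in range(1, 10)] + [(1, 2)]
print(atoms_over_limit(toy_bonds))
```

Running a count like this on the bonded interactions of a topology may help locate an atom with an unusually large number of connections, such as the atom 950 named in the error.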
> Below is the log of the successful mdrun on 4 CPUs:
>
> ################################################
>
> :-) G R O M A C S (-:
>
> Great Red Oystrich Makes All Chemists Sane
>
> :-) VERSION 3.3.1 (-:
>
>
> Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2006, The GROMACS development team,
> check out http://www.gromacs.org for more information.
>
> This program is free software; you can redistribute it and/or
> modify it under the terms of the GNU General Public License
> as published by the Free Software Foundation; either version 2
> of the License, or (at your option) any later version.
>
> :-) /opt2/gromacs-3.3.1/bin/mdrun (-:
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> E. Lindahl and B. Hess and D. van der Spoel
> GROMACS 3.0: A package for molecular simulation and trajectory analysis
> J. Mol. Mod. 7 (2001) pp. 306-317
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, D. van der Spoel and R. van Drunen
> GROMACS: A message-passing parallel molecular dynamics implementation
> Comp. Phys. Comm. 91 (1995) pp. 43-56
> -------- -------- --- Thank You --- -------- --------
>
> CPU= 0, lastcg=11941, targetcg=23909, myshift= 3
> CPU= 1, lastcg=15926, targetcg= 3959, myshift= 3
> CPU= 2, lastcg=19910, targetcg= 7943, myshift= 2
> CPU= 3, lastcg=23934, targetcg=11967, myshift= 2
> nsb->shift = 3, nsb->bshift= 0
> Listing Scalars
> nsb->nodeid: 0
> nsb->nnodes: 4
> nsb->cgtotal: 23935
> nsb->natoms: 47813
> nsb->shift: 3
> nsb->bshift: 0
> Nodeid index homenr cgload workload
> 0 0 11952 11942 11942
> 1 11952 11955 15927 15927
> 2 23907 11952 19911 19911
> 3 35859 11954 23935 23935
>
> parameters of the run:
> integrator = md
> nsteps = 1000
> init_step = 0
> ns_type = Grid
> nstlist = 20
> ndelta = 2
> bDomDecomp = FALSE
> decomp_dir = 0
> nstcomm = 1003
> comm_mode = Angular
> nstcheckpoint = 1000
> nstlog = 1000
> nstxout = 100
> nstvout = 100
> nstfout = 0
> nstenergy = 0
> nstxtcout = 0
> init_t = 0
> delta_t = 0.002
> xtcprec = 1000
> nkx = 52
> nky = 60
> nkz = 91
> pme_order = 4
> ewald_rtol = 1e-05
> ewald_geometry = 0
> epsilon_surface = 0
> optimize_fft = FALSE
> ePBC = xyz
> bUncStart = FALSE
> bShakeSOR = FALSE
> etc = Berendsen
> epc = Berendsen
> epctype = Isotropic
> tau_p = 0.5
> ref_p (3x3):
> ref_p[ 0]={ 1.01325e+00, 0.00000e+00, 0.00000e+00}
> ref_p[ 1]={ 0.00000e+00, 1.01325e+00, 0.00000e+00}
> ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 1.01325e+00}
> compress (3x3):
> compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
> compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
> compress[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
> andersen_seed = 815131
> rlist = 1
> coulombtype = PME
> rcoulomb_switch = 0
> rcoulomb = 1
> vdwtype = Cut-off
> rvdw_switch = 0
> rvdw = 1
> epsilon_r = 1
> epsilon_rf = 1
> tabext = 1
> gb_algorithm = Still
> nstgbradii = 1
> rgbradii = 2
> gb_saltconc = 0
> implicit_solvent = No
> DispCorr = No
> fudgeQQ = 0.8333
> free_energy = no
> init_lambda = 0
> sc_alpha = 0
> sc_power = 0
> sc_sigma = 0.3
> delta_lambda = 0
> disre_weighting = Conservative
> disre_mixed = FALSE
> dr_fc = 1000
> dr_tau = 0
> nstdisreout = 100
> orires_fc = 0
> orires_tau = 0
> nstorireout = 100
> dihre-fc = 1000
> dihre-tau = 0
> nstdihreout = 100
> em_stepsize = 0.01
> em_tol = 10
> niter = 20
> fc_stepsize = 0
> nstcgsteep = 1000
> nbfgscorr = 10
> ConstAlg = Lincs
> shake_tol = 1e-04
> lincs_order = 4
> lincs_warnangle = 30
> lincs_iter = 1
> bd_fric = 0
> ld_seed = 1993
> cos_accel = 0
> deform (3x3):
> deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
> deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
> deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
> userint1 = 0
> userint2 = 0
> userint3 = 0
> userint4 = 0
> userreal1 = 0
> userreal2 = 0
> userreal3 = 0
> userreal4 = 0
> grpopts:
> nrdf: 101649
> ref_t: 298.15
> tau_t: 0.2
> anneal: No
> ann_npoints: 0
> acc: 0 0 0
> nfreeze: Y Y Y N N N
> energygrp_flags[ 0]: 0
> efield-x:
> n = 0
> efield-xt:
> n = 0
> efield-y:
> n = 0
> efield-yt:
> n = 0
> efield-z:
> n = 0
> efield-zt:
> n = 0
> bQMMM = FALSE
> QMconstraints = 0
> QMMMscheme = 0
> scalefactor = 1
> qm_opts:
> ngQM = 0
> Max number of graph edges per atom is 6
> Table routines are used for coulomb: TRUE
> Table routines are used for vdw: FALSE
> Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
> Cut-off's: NS: 1 Coulomb: 1 LJ: 1
> System total charge: -0.000
> Generated table with 1000 data points for Ewald.
> Tabscale = 500 points/nm
> Generated table with 1000 data points for LJ6.
> Tabscale = 500 points/nm
> Generated table with 1000 data points for LJ12.
> Tabscale = 500 points/nm
> Generated table with 500 data points for 1-4 COUL.
> Tabscale = 500 points/nm
> Generated table with 500 data points for 1-4 LJ6.
> Tabscale = 500 points/nm
> Generated table with 500 data points for 1-4 LJ12.
> Tabscale = 500 points/nm
>
> Enabling SPC water optimization for 11938 molecules.
>
> Will do PME sum in reciprocal space.
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> U. Essman, L. Perela, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
> A smooth particle mesh Ewald method
> J. Chem. Phys. 103 (1995) pp. 8577-8592
> -------- -------- --- Thank You --- -------- --------
>
> Parallelized PME sum used.
> PARALLEL FFT DATA:
> local_nx: 13 local_x_start: 0
> local_ny_after_transpose: 15 local_y_start_after_transpose 0
> Removing pbc first time
> Done rmpbc
> Center of mass motion removal mode is Angular
> We have the following groups for center of mass motion removal:
> 0: rest, initial mass: 302278
> There are: 11952 Atoms
>
> Constraining the starting coordinates (step -2)
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, J. P. M. Postma, A. DiNola and J. R. Haak
> Molecular dynamics with coupling to an external bath
> J. Chem. Phys. 81 (1984) pp. 3684-3690
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> S. Miyamoto and P. A. Kollman
> SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
> Water Models
> J. Comp. Chem. 13 (1992) pp. 952-962
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
> LINCS: A Linear Constraint Solver for molecular simulations
> J. Comp. Chem. 18 (1997) pp. 1463-1472
> -------- -------- --- Thank You --- -------- --------
>
>
> Initializing LINear Constraint Solver
> number of constraints is 5967
> average number of constraints coupled to one constraint is 0.9
>
> Rel. Constraint Deviation: Max between atoms RMS
> Before LINCS 0.012234 6702 6703 0.003802
> After LINCS 0.000005 7804 7806 0.000001
>
> Going to use C-settle (4 waters)
> wo = 0.888099, wh =0.0559503, wohh = 18.016, rc = 0.075695, ra = 0.00655606
> rb = 0.0520322, rc2 = 0.15139, rone = 1, dHH = 0.15139, dOH = 0.09572
>
> Constraining the coordinates at t0-dt (step -1)
> Rel. Constraint Deviation: Max between atoms RMS
> Before LINCS 0.001303 7535 7537 0.000163
> After LINCS 0.000026 10529 10532 0.000003
>
> Started mdrun on node 0 Sat Oct 25 17:35:08 2008
> Initial temperature: 296.531 K
> Step Time Lambda
> 0 0.00000 0.00000
>
> Grid: 12 x 14 x 21 cells
> Configuring nonbonded kernels...
> Testing AMD 3DNow support... not present.
> Testing ia32 SSE support... present.
>
>
> Rel. Constraint Deviation: Max between atoms RMS
> Before LINCS 0.061793 7445 7446 0.007589
> After LINCS 0.000032 10529 10532 0.000003
>
> Energies (kJ/mol)
> Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
> 9.42960e+03 2.53513e+04 1.38929e+03 2.94437e+04 1.19233e+04
> Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
> 1.38451e+05 4.34924e+04 -5.72876e+05 -2.39230e+05 -5.52626e+05
> Kinetic En. Total Energy Temperature Pressure (bar)
> 1.25294e+05 -4.27332e+05 2.96497e+02 -1.70726e+02
>
> Step Time Lambda
> 1000 2.00000 0.00000
>
> Rel. Constraint Deviation: Max between atoms RMS
> Before LINCS 0.072407 7445 7446 0.007658
> After LINCS 0.000032 3550 3551 0.000003
>
> Energies (kJ/mol)
> Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
> 9.31967e+03 2.53570e+04 1.35254e+03 2.94476e+04 1.20265e+04
> Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
> 1.38690e+05 4.33753e+04 -5.74055e+05 -2.39420e+05 -5.53906e+05
> Kinetic En. Total Energy Temperature Pressure (bar)
> 1.26112e+05 -4.27794e+05 2.98434e+02 5.69840e+01
>
>
> Total NODE time on node 0: 242.99
> Average NODE time: 60.7475
> Load imbalance reduced performance to 400% of max
> <====== ############### ==>
> <==== A V E R A G E S ====>
> <== ############### ======>
>
> Energies (kJ/mol)
> Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
> 9.28039e+03 2.53419e+04 1.35613e+03 2.93246e+04 1.19887e+04
> Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
> 1.38392e+05 4.33318e+04 -5.73108e+05 -2.39245e+05 -5.53338e+05
> Kinetic En. Total Energy Temperature Pressure (bar)
> 1.25976e+05 -4.27362e+05 2.98111e+02 -2.69801e+00
>
> Box-X Box-Y Box-Z Volume Density (SI)
> 6.09850e+00 7.19869e+00 1.08997e+01 4.78511e+02 1.04899e+03
> pV
> -7.77757e+01
>
> Total Virial (kJ/mol)
> 4.16629e+04 3.74690e+02 -3.38657e+02
> 3.74799e+02 4.14944e+04 1.74394e+02
> -3.41842e+02 1.75904e+02 4.29355e+04
>
> Pressure (bar)
> 2.41223e+01 -2.06918e+01 2.23237e+01
> -2.06993e+01 4.74897e+01 -6.10837e+00
> 2.25447e+01 -6.21320e+00 -7.97060e+01
>
> Total Dipole (Debye)
> -6.55277e+02 -8.61214e+02 4.17082e+02
>
> <====== ############################### ==>
> <==== R M S - F L U C T U A T I O N S ====>
> <== ############################### ======>
>
> Energies (kJ/mol)
> Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
> 1.55015e+02 2.03651e+02 4.20929e+01 1.22827e+02 7.61767e+01
> Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
> 1.81161e+02 5.82316e+02 7.17480e+02 9.44462e+01 4.40698e+02
> Kinetic En. Total Energy Temperature Pressure (bar)
> 3.93428e+02 1.45874e+02 9.31012e-01 1.29635e+02
>
> Box-X Box-Y Box-Z Volume Density (SI)
> 1.23222e-03 1.45343e-03 2.20161e-03 2.89950e-01 6.35630e-01
> pV
> 3.73587e+03
>
> Total Virial (kJ/mol)
> 2.75764e+03 1.92974e+03 2.11050e+03
> 1.93131e+03 2.94483e+03 2.05791e+03
> 2.11304e+03 2.05311e+03 3.31016e+03
>
> Pressure (bar)
> 1.90896e+02 1.36925e+02 1.48015e+02
> 1.37042e+02 2.04299e+02 1.43707e+02
> 1.48169e+02 1.43367e+02 2.31551e+02
>
> Total Dipole (Debye)
> 9.68668e+01 1.43086e+02 3.27665e+02
>
>
> M E G A - F L O P S A C C O U N T I N G
>
> Parallel run - timing based on wallclock.
> RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
> T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
> NF=No Forces
>
> Computing: M-Number M-Flops % of Flops
> -----------------------------------------------------------------------
> Coul(T) 646.149375 27138.273750 4.2
> Coul(T) [W3] 2.543824 317.978000 0.0
> Coul(T) + LJ 2021.495476 111182.251180 17.3
> Coul(T) + LJ [W3] 219.255641 30257.278458 4.7
> Coul(T) + LJ [W3-W3] 730.391892 279009.702744 43.4
> Outer nonbonded loop 173.047881 1730.478810 0.3
> 1,4 nonbonded interactions 31.503472 2835.312480 0.4
> Spread Q Bspline 3063.092032 6126.184064 1.0
> Gather F Bspline 3063.092032 36757.104384 5.7
> 3D-FFT 10296.774488 82374.195904 12.8
> Solve PME 568.407840 36378.101760 5.7
> NS-Pairs 394.401556 8282.432676 1.3
> Reset In Box 2.438463 21.946167 0.0
> Shift-X 95.603508 573.621048 0.1
> CG-CoM 1.220685 35.399865 0.0
> Sum Forces 143.582439 143.582439 0.0
> Bonds 6.193187 266.307041 0.0
> Angles 22.014993 3588.443859 0.6
> Propers 2.389387 547.169623 0.1
> RB-Dihedrals 25.411386 6276.612342 1.0
> Virial 47.968921 863.440578 0.1
> Update 47.860813 1483.685203 0.2
> Stop-CM 0.047813 0.478130 0.0
> P-Coupling 47.860813 287.164878 0.0
> Calc-Ekin 47.908626 1293.532902 0.2
> Lincs 5.984901 359.094060 0.1
> Lincs-Mat 31.955580 127.822320 0.0
> Constraint-V 47.860813 287.164878 0.0
> Constraint-Vir 41.906343 1005.752232 0.2
> Settle 11.973814 3867.541922 0.6
> -----------------------------------------------------------------------
> Total 643418.053697 100.0
> -----------------------------------------------------------------------
>
> NODE (s) Real (s) (%)
> Time: 281.000 281.000 100.0
> 4:41
> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
> Performance: 12.882 2.290 0.615 39.028
>
> Detailed load balancing info in percentage of average
> Type NODE: 0 1 2 3 Scaling
> ---------------------------------------
> Coul(T):399 0 0 0 25%
> Coul(T) [W3]: 0 36 106 256 38%
> Coul(T) + LJ:399 0 0 0 25%
> Coul(T) + LJ [W3]: 0 43 126 229 43%
> Coul(T) + LJ [W3-W3]: 0 218 135 46 45%
> Outer nonbonded loop:172 71 77 78 57%
> 1,4 nonbonded interactions:400 0 0 0 25%
> Spread Q Bspline: 99 101 101 97 98%
> Gather F Bspline: 99 101 101 97 98%
> 3D-FFT:100 100 100 100 100%
> Solve PME:100 100 100 100 100%
> NS-Pairs:266 56 43 32 37%
> Reset In Box: 99 100 99 100 99%
> Shift-X:100 100 100 99 99%
> CG-CoM:199 66 66 67 50%
> Sum Forces: 99 100 99 100 99%
> Bonds:400 0 0 0 25%
> Angles:400 0 0 0 25%
> Propers:400 0 0 0 25%
> RB-Dihedrals:400 0 0 0 25%
> Virial: 99 100 99 100 99%
> Update: 99 100 99 100 99%
> Stop-CM: 99 100 99 100 99%
> P-Coupling: 99 100 99 100 99%
> Calc-Ekin: 99 100 99 100 99%
> Lincs:400 0 0 0 25%
> Lincs-Mat:400 0 0 0 25%
> Constraint-V: 99 100 99 100 99%
> Constraint-Vir: 57 114 114 113 87%
> Settle: 0 133 133 132 74%
>
> Total Force:132 133 89 43 74%
>
>
> Total Shake: 38 120 120 120 82%
>
>
> Total Scaling: 75% of max performance
>
> Finished mdrun on node 0 Sat Oct 25 17:39:49 2008
>
>
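
The "Scaling" column in the "Detailed load balancing info" table of the log above appears to be the mean per-node load divided by the load of the busiest node, truncated to a whole percentage. This is inferred from the printed numbers only, not from the GROMACS source, and the function name below is hypothetical. A small sketch of that arithmetic:

```python
def scaling_percent(node_loads):
    """Mean load over maximum load, as a truncated whole-number percentage."""
    busiest = max(node_loads)
    if busiest == 0:
        return 100  # nothing to balance
    # Integer arithmetic avoids floating-point rounding surprises.
    return (100 * sum(node_loads)) // (len(node_loads) * busiest)


# Per-node loads copied from rows of the table (percent of average).
rows = {
    "Coul(T)": [399, 0, 0, 0],
    "NS-Pairs": [266, 56, 43, 32],
    "3D-FFT": [100, 100, 100, 100],
    "Total Force": [132, 133, 89, 43],
    "Total Shake": [38, 120, 120, 120],
}
for name, loads in rows.items():
    print(f"{name}: {scaling_percent(loads)}%")
```

Rows such as Coul(T), Bonds, Angles and Lincs show 25% because all of that work ran on node 0 in this 4-CPU run.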
--
David van der Spoel, Ph.D., Professor of Biology
Molec. Biophys. group, Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205. Fax: +4618511755.
spoel at xray.bmc.uu.se spoel at gromacs.org http://folding.bmc.uu.se