[gmx-developers] Error in parallel mdrun: "More than 8 graph edges per atom"
Igor Leontyev
ileontyev at ucdavis.edu
Sun Oct 26 01:32:27 CEST 2008
Hi, GROMACS experts.
I have a problem starting parallel simulations on 5 or more CPUs:
-------------------------------------------------------
Program mdrun, VERSION 3.3.1
Source code file: mshift.c, line: 91
Fatal error:
More than 8 graph edges per atom (atom 950)
-------------------------------------------------------
The previously reported solution, "to increase the number 4 on the line 229 of
mshift.c", requires recompiling the code, which is rather complicated for me.
Moreover, that solution seems to be irrelevant to my case, since the problem
disappears when the simulation is started on 4 CPUs. Does anyone have an idea
how to avoid the problem in a 16-CPU mdrun without recompilation?
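To make the wording of the error concrete, here is a minimal Python sketch of what "graph edges per atom" could mean: the number of distinct atoms a given atom is bonded to. This is not GROMACS code and does not claim to reproduce how mshift.c counts edges. The function names, the toy bond list, and the use of 8 as the limit (taken only from the error message) are assumptions for illustration.

```python
from collections import defaultdict


def edges_per_atom(bonds):
    """Count distinct bonded neighbours for each atom in a list of (i, j) pairs."""
    neighbours = defaultdict(set)
    for i, j in bonds:
        if i == j:
            continue  # ignore self-pairs, they add no edge
        neighbours[i].add(j)
        neighbours[j].add(i)
    return {atom: len(others) for atom, others in neighbours.items()}


def atoms_over_limit(bonds, limit=8):
    """Return the sorted atoms whose edge count exceeds the limit (8 is an assumption)."""
    counts = edges_per_atom(bonds)
    return sorted(atom for atom, count in counts.items() if count > limit)


# Hypothetical toy input: atom 950 is linked to nine other atoms.
toy_bonds = [(950, k) for k in range(951, 960)] + [(1, 2), (2, 3)]

print(max(edges_per_atom(toy_bonds).values()))  # largest edge count in the toy graph
print(atoms_over_limit(toy_bonds))              # atoms that would exceed the limit
```

A check of this kind, run on the bonded pairs of the topology, would show which atoms have unusually many neighbours. It cannot explain why the count reported by mdrun depends on the number of CPUs.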
Below is the log of the successful mdrun on 4 CPUs:
################################################
:-) G R O M A C S (-:
Great Red Oystrich Makes All Chemists Sane
:-) VERSION 3.3.1 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2006, The GROMACS development team,
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) /opt2/gromacs-3.3.1/bin/mdrun (-:
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------
CPU= 0, lastcg=11941, targetcg=23909, myshift= 3
CPU= 1, lastcg=15926, targetcg= 3959, myshift= 3
CPU= 2, lastcg=19910, targetcg= 7943, myshift= 2
CPU= 3, lastcg=23934, targetcg=11967, myshift= 2
nsb->shift = 3, nsb->bshift= 0
Listing Scalars
nsb->nodeid: 0
nsb->nnodes: 4
nsb->cgtotal: 23935
nsb->natoms: 47813
nsb->shift: 3
nsb->bshift: 0
Nodeid index homenr cgload workload
0 0 11952 11942 11942
1 11952 11955 15927 15927
2 23907 11952 19911 19911
3 35859 11954 23935 23935
parameters of the run:
integrator = md
nsteps = 1000
init_step = 0
ns_type = Grid
nstlist = 20
ndelta = 2
bDomDecomp = FALSE
decomp_dir = 0
nstcomm = 1003
comm_mode = Angular
nstcheckpoint = 1000
nstlog = 1000
nstxout = 100
nstvout = 100
nstfout = 0
nstenergy = 0
nstxtcout = 0
init_t = 0
delta_t = 0.002
xtcprec = 1000
nkx = 52
nky = 60
nkz = 91
pme_order = 4
ewald_rtol = 1e-05
ewald_geometry = 0
epsilon_surface = 0
optimize_fft = FALSE
ePBC = xyz
bUncStart = FALSE
bShakeSOR = FALSE
etc = Berendsen
epc = Berendsen
epctype = Isotropic
tau_p = 0.5
ref_p (3x3):
ref_p[ 0]={ 1.01325e+00, 0.00000e+00, 0.00000e+00}
ref_p[ 1]={ 0.00000e+00, 1.01325e+00, 0.00000e+00}
ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 1.01325e+00}
compress (3x3):
compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
compress[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
andersen_seed = 815131
rlist = 1
coulombtype = PME
rcoulomb_switch = 0
rcoulomb = 1
vdwtype = Cut-off
rvdw_switch = 0
rvdw = 1
epsilon_r = 1
epsilon_rf = 1
tabext = 1
gb_algorithm = Still
nstgbradii = 1
rgbradii = 2
gb_saltconc = 0
implicit_solvent = No
DispCorr = No
fudgeQQ = 0.8333
free_energy = no
init_lambda = 0
sc_alpha = 0
sc_power = 0
sc_sigma = 0.3
delta_lambda = 0
disre_weighting = Conservative
disre_mixed = FALSE
dr_fc = 1000
dr_tau = 0
nstdisreout = 100
orires_fc = 0
orires_tau = 0
nstorireout = 100
dihre-fc = 1000
dihre-tau = 0
nstdihreout = 100
em_stepsize = 0.01
em_tol = 10
niter = 20
fc_stepsize = 0
nstcgsteep = 1000
nbfgscorr = 10
ConstAlg = Lincs
shake_tol = 1e-04
lincs_order = 4
lincs_warnangle = 30
lincs_iter = 1
bd_fric = 0
ld_seed = 1993
cos_accel = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
nrdf: 101649
ref_t: 298.15
tau_t: 0.2
anneal: No
ann_npoints: 0
acc: 0 0 0
nfreeze: Y Y Y N N N
energygrp_flags[ 0]: 0
efield-x:
n = 0
efield-xt:
n = 0
efield-y:
n = 0
efield-yt:
n = 0
efield-z:
n = 0
efield-zt:
n = 0
bQMMM = FALSE
QMconstraints = 0
QMMMscheme = 0
scalefactor = 1
qm_opts:
ngQM = 0
Max number of graph edges per atom is 6
Table routines are used for coulomb: TRUE
Table routines are used for vdw: FALSE
Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
Cut-off's: NS: 1 Coulomb: 1 LJ: 1
System total charge: -0.000
Generated table with 1000 data points for Ewald.
Tabscale = 500 points/nm
Generated table with 1000 data points for LJ6.
Tabscale = 500 points/nm
Generated table with 1000 data points for LJ12.
Tabscale = 500 points/nm
Generated table with 500 data points for 1-4 COUL.
Tabscale = 500 points/nm
Generated table with 500 data points for 1-4 LJ6.
Tabscale = 500 points/nm
Generated table with 500 data points for 1-4 LJ12.
Tabscale = 500 points/nm
Enabling SPC water optimization for 11938 molecules.
Will do PME sum in reciprocal space.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essman, L. Perela, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Parallelized PME sum used.
PARALLEL FFT DATA:
local_nx: 13 local_x_start: 0
local_ny_after_transpose: 15 local_y_start_after_transpose 0
Removing pbc first time
Done rmpbc
Center of mass motion removal mode is Angular
We have the following groups for center of mass motion removal:
0: rest, initial mass: 302278
There are: 11952 Atoms
Constraining the starting coordinates (step -2)
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, J. P. M. Postma, A. DiNola and J. R. Haak
Molecular dynamics with coupling to an external bath
J. Chem. Phys. 81 (1984) pp. 3684-3690
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
LINCS: A Linear Constraint Solver for molecular simulations
J. Comp. Chem. 18 (1997) pp. 1463-1472
-------- -------- --- Thank You --- -------- --------
Initializing LINear Constraint Solver
number of constraints is 5967
average number of constraints coupled to one constraint is 0.9
Rel. Constraint Deviation: Max between atoms RMS
Before LINCS 0.012234 6702 6703 0.003802
After LINCS 0.000005 7804 7806 0.000001
Going to use C-settle (4 waters)
wo = 0.888099, wh =0.0559503, wohh = 18.016, rc = 0.075695, ra = 0.00655606
rb = 0.0520322, rc2 = 0.15139, rone = 1, dHH = 0.15139, dOH = 0.09572
Constraining the coordinates at t0-dt (step -1)
Rel. Constraint Deviation: Max between atoms RMS
Before LINCS 0.001303 7535 7537 0.000163
After LINCS 0.000026 10529 10532 0.000003
Started mdrun on node 0 Sat Oct 25 17:35:08 2008
Initial temperature: 296.531 K
Step Time Lambda
0 0.00000 0.00000
Grid: 12 x 14 x 21 cells
Configuring nonbonded kernels...
Testing AMD 3DNow support... not present.
Testing ia32 SSE support... present.
Rel. Constraint Deviation: Max between atoms RMS
Before LINCS 0.061793 7445 7446 0.007589
After LINCS 0.000032 10529 10532 0.000003
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
9.42960e+03 2.53513e+04 1.38929e+03 2.94437e+04 1.19233e+04
Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
1.38451e+05 4.34924e+04 -5.72876e+05 -2.39230e+05 -5.52626e+05
Kinetic En. Total Energy Temperature Pressure (bar)
1.25294e+05 -4.27332e+05 2.96497e+02 -1.70726e+02
Step Time Lambda
1000 2.00000 0.00000
Rel. Constraint Deviation: Max between atoms RMS
Before LINCS 0.072407 7445 7446 0.007658
After LINCS 0.000032 3550 3551 0.000003
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
9.31967e+03 2.53570e+04 1.35254e+03 2.94476e+04 1.20265e+04
Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
1.38690e+05 4.33753e+04 -5.74055e+05 -2.39420e+05 -5.53906e+05
Kinetic En. Total Energy Temperature Pressure (bar)
1.26112e+05 -4.27794e+05 2.98434e+02 5.69840e+01
Total NODE time on node 0: 242.99
Average NODE time: 60.7475
Load imbalance reduced performance to 400% of max
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
9.28039e+03 2.53419e+04 1.35613e+03 2.93246e+04 1.19887e+04
Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
1.38392e+05 4.33318e+04 -5.73108e+05 -2.39245e+05 -5.53338e+05
Kinetic En. Total Energy Temperature Pressure (bar)
1.25976e+05 -4.27362e+05 2.98111e+02 -2.69801e+00
Box-X Box-Y Box-Z Volume Density (SI)
6.09850e+00 7.19869e+00 1.08997e+01 4.78511e+02 1.04899e+03
pV
-7.77757e+01
Total Virial (kJ/mol)
4.16629e+04 3.74690e+02 -3.38657e+02
3.74799e+02 4.14944e+04 1.74394e+02
-3.41842e+02 1.75904e+02 4.29355e+04
Pressure (bar)
2.41223e+01 -2.06918e+01 2.23237e+01
-2.06993e+01 4.74897e+01 -6.10837e+00
2.25447e+01 -6.21320e+00 -7.97060e+01
Total Dipole (Debye)
-6.55277e+02 -8.61214e+02 4.17082e+02
<====== ############################### ==>
<==== R M S - F L U C T U A T I O N S ====>
<== ############################### ======>
Energies (kJ/mol)
Bond Angle Proper Dih. Ryckaert-Bell. LJ-14
1.55015e+02 2.03651e+02 4.20929e+01 1.22827e+02 7.61767e+01
Coulomb-14 LJ (SR) Coulomb (SR) Coul. recip. Potential
1.81161e+02 5.82316e+02 7.17480e+02 9.44462e+01 4.40698e+02
Kinetic En. Total Energy Temperature Pressure (bar)
3.93428e+02 1.45874e+02 9.31012e-01 1.29635e+02
Box-X Box-Y Box-Z Volume Density (SI)
1.23222e-03 1.45343e-03 2.20161e-03 2.89950e-01 6.35630e-01
pV
3.73587e+03
Total Virial (kJ/mol)
2.75764e+03 1.92974e+03 2.11050e+03
1.93131e+03 2.94483e+03 2.05791e+03
2.11304e+03 2.05311e+03 3.31016e+03
Pressure (bar)
1.90896e+02 1.36925e+02 1.48015e+02
1.37042e+02 2.04299e+02 1.43707e+02
1.48169e+02 1.43367e+02 2.31551e+02
Total Dipole (Debye)
9.68668e+01 1.43086e+02 3.27665e+02
M E G A - F L O P S A C C O U N T I N G
Parallel run - timing based on wallclock.
RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
NF=No Forces
Computing: M-Number M-Flops % of Flops
-----------------------------------------------------------------------
Coul(T) 646.149375 27138.273750 4.2
Coul(T) [W3] 2.543824 317.978000 0.0
Coul(T) + LJ 2021.495476 111182.251180 17.3
Coul(T) + LJ [W3] 219.255641 30257.278458 4.7
Coul(T) + LJ [W3-W3] 730.391892 279009.702744 43.4
Outer nonbonded loop 173.047881 1730.478810 0.3
1,4 nonbonded interactions 31.503472 2835.312480 0.4
Spread Q Bspline 3063.092032 6126.184064 1.0
Gather F Bspline 3063.092032 36757.104384 5.7
3D-FFT 10296.774488 82374.195904 12.8
Solve PME 568.407840 36378.101760 5.7
NS-Pairs 394.401556 8282.432676 1.3
Reset In Box 2.438463 21.946167 0.0
Shift-X 95.603508 573.621048 0.1
CG-CoM 1.220685 35.399865 0.0
Sum Forces 143.582439 143.582439 0.0
Bonds 6.193187 266.307041 0.0
Angles 22.014993 3588.443859 0.6
Propers 2.389387 547.169623 0.1
RB-Dihedrals 25.411386 6276.612342 1.0
Virial 47.968921 863.440578 0.1
Update 47.860813 1483.685203 0.2
Stop-CM 0.047813 0.478130 0.0
P-Coupling 47.860813 287.164878 0.0
Calc-Ekin 47.908626 1293.532902 0.2
Lincs 5.984901 359.094060 0.1
Lincs-Mat 31.955580 127.822320 0.0
Constraint-V 47.860813 287.164878 0.0
Constraint-Vir 41.906343 1005.752232 0.2
Settle 11.973814 3867.541922 0.6
-----------------------------------------------------------------------
Total 643418.053697 100.0
-----------------------------------------------------------------------
NODE (s) Real (s) (%)
Time: 281.000 281.000 100.0
4:41
(Mnbf/s) (GFlops) (ns/day) (hour/ns)
Performance: 12.882 2.290 0.615 39.028
Detailed load balancing info in percentage of average
Type NODE: 0 1 2 3 Scaling
---------------------------------------
Coul(T):399 0 0 0 25%
Coul(T) [W3]: 0 36 106 256 38%
Coul(T) + LJ:399 0 0 0 25%
Coul(T) + LJ [W3]: 0 43 126 229 43%
Coul(T) + LJ [W3-W3]: 0 218 135 46 45%
Outer nonbonded loop:172 71 77 78 57%
1,4 nonbonded interactions:400 0 0 0 25%
Spread Q Bspline: 99 101 101 97 98%
Gather F Bspline: 99 101 101 97 98%
3D-FFT:100 100 100 100 100%
Solve PME:100 100 100 100 100%
NS-Pairs:266 56 43 32 37%
Reset In Box: 99 100 99 100 99%
Shift-X:100 100 100 99 99%
CG-CoM:199 66 66 67 50%
Sum Forces: 99 100 99 100 99%
Bonds:400 0 0 0 25%
Angles:400 0 0 0 25%
Propers:400 0 0 0 25%
RB-Dihedrals:400 0 0 0 25%
Virial: 99 100 99 100 99%
Update: 99 100 99 100 99%
Stop-CM: 99 100 99 100 99%
P-Coupling: 99 100 99 100 99%
Calc-Ekin: 99 100 99 100 99%
Lincs:400 0 0 0 25%
Lincs-Mat:400 0 0 0 25%
Constraint-V: 99 100 99 100 99%
Constraint-Vir: 57 114 114 113 87%
Settle: 0 133 133 132 74%
Total Force:132 133 89 43 74%
Total Shake: 38 120 120 120 82%
Total Scaling: 75% of max performance
Finished mdrun on node 0 Sat Oct 25 17:39:49 2008
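As a cross-check, the Performance line in the log above can be reproduced from other numbers in the same log: nsteps = 1000, delta_t = 0.002 ps, the 281 s wall time, the total M-Flops, and the M-Numbers of the five nonbonded kernels. The short Python sketch below does the arithmetic. The variable names are illustrative, and reading Mnbf/s as the summed nonbonded M-Numbers divided by wall time is an assumption that happens to match the logged value.

```python
# Values copied from the 4-CPU log.
nsteps = 1000
delta_t_ps = 0.002            # time step in ps
wall_s = 281.0                # "Time:" line, real seconds
total_mflop = 643418.053697   # "Total" row of the flop accounting

# M-Numbers of the five Coul(T) / Coul(T) + LJ kernel rows (assumed to be
# the quantity behind the Mnbf/s figure).
nonbonded_m = [646.149375, 2.543824, 2021.495476, 219.255641, 730.391892]

simulated_ns = nsteps * delta_t_ps / 1000.0       # 2 ps = 0.002 ns
ns_per_day = simulated_ns * 86400.0 / wall_s
hours_per_ns = wall_s / 3600.0 / simulated_ns
gflops = total_mflop / wall_s / 1000.0
mnbf_per_s = sum(nonbonded_m) / wall_s

print(f"{mnbf_per_s:.3f} {gflops:.3f} {ns_per_day:.3f} {hours_per_ns:.3f}")
# prints: 12.882 2.290 0.615 39.028
```

The four values agree with the logged Performance line, so the 0.615 ns/day figure follows directly from 2 ps of simulated time in 281 s of wall time.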