[gmx-developers] Error in parallel mdrun: "More than 8 graph edges per atom"

David van der Spoel spoel at xray.bmc.uu.se
Sun Oct 26 09:29:10 CET 2008


Igor Leontyev wrote:
> Hi, GROMACS experts.
> I have a problem starting parallel simulations on 5 or more CPUs:

Maybe not what you want to hear, but version 3.3.1 is not really 
supported anymore. This *might* have been fixed in 3.3.3, but your best 
bet for parallelization is version 4.0.


> -------------------------------------------------------
> Program mdrun, VERSION 3.3.1
> Source code file: mshift.c, line: 91
> Fatal error:
> More than 8 graph edges per atom (atom 950)
> -------------------------------------------------------
> The previously reported solution, "to increase the number 4 on the line
> 229 of mshift.c", requires recompiling, which is rather complicated for
> me. Moreover, that solution seems irrelevant in my case, since the
> problem disappears when the simulation is started on 4 CPUs. Does anyone
> have any idea how to avoid the problem in a 16-CPU mdrun without
> recompiling?
>       Below is the log of the successful mdrun on 4 CPUs:
> 
> ################################################
> 
>                         :-)  G  R  O  M  A  C  S  (-:
> 
>                   Great Red Oystrich Makes All Chemists Sane
> 
>                            :-)  VERSION 3.3.1  (-:
> 
> 
>      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
>       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>             Copyright (c) 2001-2006, The GROMACS development team,
>            check out http://www.gromacs.org for more information.
> 
>         This program is free software; you can redistribute it and/or
>          modify it under the terms of the GNU General Public License
>         as published by the Free Software Foundation; either version 2
>             of the License, or (at your option) any later version.
> 
>                    :-)  /opt2/gromacs-3.3.1/bin/mdrun  (-:
> 
> 
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> E. Lindahl and B. Hess and D. van der Spoel
> GROMACS 3.0: A package for molecular simulation and trajectory analysis
> J. Mol. Mod. 7 (2001) pp. 306-317
> -------- -------- --- Thank You --- -------- --------
> 
> 
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, D. van der Spoel and R. van Drunen
> GROMACS: A message-passing parallel molecular dynamics implementation
> Comp. Phys. Comm. 91 (1995) pp. 43-56
> -------- -------- --- Thank You --- -------- --------
> 
> CPU=  0, lastcg=11941, targetcg=23909, myshift=    3
> CPU=  1, lastcg=15926, targetcg= 3959, myshift=    3
> CPU=  2, lastcg=19910, targetcg= 7943, myshift=    2
> CPU=  3, lastcg=23934, targetcg=11967, myshift=    2
> nsb->shift =   3, nsb->bshift=  0
> Listing Scalars
> nsb->nodeid:         0
> nsb->nnodes:      4
> nsb->cgtotal: 23935
> nsb->natoms:  47813
> nsb->shift:       3
> nsb->bshift:      0
> Nodeid   index  homenr  cgload  workload
>     0       0   11952   11942     11942
>     1   11952   11955   15927     15927
>     2   23907   11952   19911     19911
>     3   35859   11954   23935     23935
> 
> parameters of the run:
>   integrator           = md
>   nsteps               = 1000
>   init_step            = 0
>   ns_type              = Grid
>   nstlist              = 20
>   ndelta               = 2
>   bDomDecomp           = FALSE
>   decomp_dir           = 0
>   nstcomm              = 1003
>   comm_mode            = Angular
>   nstcheckpoint        = 1000
>   nstlog               = 1000
>   nstxout              = 100
>   nstvout              = 100
>   nstfout              = 0
>   nstenergy            = 0
>   nstxtcout            = 0
>   init_t               = 0
>   delta_t              = 0.002
>   xtcprec              = 1000
>   nkx                  = 52
>   nky                  = 60
>   nkz                  = 91
>   pme_order            = 4
>   ewald_rtol           = 1e-05
>   ewald_geometry       = 0
>   epsilon_surface      = 0
>   optimize_fft         = FALSE
>   ePBC                 = xyz
>   bUncStart            = FALSE
>   bShakeSOR            = FALSE
>   etc                  = Berendsen
>   epc                  = Berendsen
>   epctype              = Isotropic
>   tau_p                = 0.5
>   ref_p (3x3):
>      ref_p[    0]={ 1.01325e+00,  0.00000e+00,  0.00000e+00}
>      ref_p[    1]={ 0.00000e+00,  1.01325e+00,  0.00000e+00}
>      ref_p[    2]={ 0.00000e+00,  0.00000e+00,  1.01325e+00}
>   compress (3x3):
>      compress[    0]={ 4.50000e-05,  0.00000e+00,  0.00000e+00}
>      compress[    1]={ 0.00000e+00,  4.50000e-05,  0.00000e+00}
>      compress[    2]={ 0.00000e+00,  0.00000e+00,  4.50000e-05}
>   andersen_seed        = 815131
>   rlist                = 1
>   coulombtype          = PME
>   rcoulomb_switch      = 0
>   rcoulomb             = 1
>   vdwtype              = Cut-off
>   rvdw_switch          = 0
>   rvdw                 = 1
>   epsilon_r            = 1
>   epsilon_rf           = 1
>   tabext               = 1
>   gb_algorithm         = Still
>   nstgbradii           = 1
>   rgbradii             = 2
>   gb_saltconc          = 0
>   implicit_solvent     = No
>   DispCorr             = No
>   fudgeQQ              = 0.8333
>   free_energy          = no
>   init_lambda          = 0
>   sc_alpha             = 0
>   sc_power             = 0
>   sc_sigma             = 0.3
>   delta_lambda         = 0
>   disre_weighting      = Conservative
>   disre_mixed          = FALSE
>   dr_fc                = 1000
>   dr_tau               = 0
>   nstdisreout          = 100
>   orires_fc            = 0
>   orires_tau           = 0
>   nstorireout          = 100
>   dihre-fc             = 1000
>   dihre-tau            = 0
>   nstdihreout          = 100
>   em_stepsize          = 0.01
>   em_tol               = 10
>   niter                = 20
>   fc_stepsize          = 0
>   nstcgsteep           = 1000
>   nbfgscorr            = 10
>   ConstAlg             = Lincs
>   shake_tol            = 1e-04
>   lincs_order          = 4
>   lincs_warnangle      = 30
>   lincs_iter           = 1
>   bd_fric              = 0
>   ld_seed              = 1993
>   cos_accel            = 0
>   deform (3x3):
>      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
>   userint1             = 0
>   userint2             = 0
>   userint3             = 0
>   userint4             = 0
>   userreal1            = 0
>   userreal2            = 0
>   userreal3            = 0
>   userreal4            = 0
> grpopts:
>   nrdf:            101649
>   ref_t:           298.15
>   tau_t:              0.2
> anneal:                        No
> ann_npoints:             0
>   acc:                  0           0           0
>   nfreeze:           Y           Y           Y           N           N N
>   energygrp_flags[  0]: 0
>   efield-x:
>      n = 0
>   efield-xt:
>      n = 0
>   efield-y:
>      n = 0
>   efield-yt:
>      n = 0
>   efield-z:
>      n = 0
>   efield-zt:
>      n = 0
>   bQMMM                = FALSE
>   QMconstraints        = 0
>   QMMMscheme           = 0
>   scalefactor          = 1
> qm_opts:
>   ngQM                 = 0
> Max number of graph edges per atom is 6
> Table routines are used for coulomb: TRUE
> Table routines are used for vdw:     FALSE
> Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
> Cut-off's:   NS: 1   Coulomb: 1   LJ: 1
> System total charge: -0.000
> Generated table with 1000 data points for Ewald.
> Tabscale = 500 points/nm
> Generated table with 1000 data points for LJ6.
> Tabscale = 500 points/nm
> Generated table with 1000 data points for LJ12.
> Tabscale = 500 points/nm
> Generated table with 500 data points for 1-4 COUL.
> Tabscale = 500 points/nm
> Generated table with 500 data points for 1-4 LJ6.
> Tabscale = 500 points/nm
> Generated table with 500 data points for 1-4 LJ12.
> Tabscale = 500 points/nm
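
The Gaussian width reported in the log above appears consistent with choosing beta so that erfc(beta * rcoulomb) = ewald_rtol. The following is a minimal Python sketch under that assumption; the helper name ewald_beta and the bisection scheme are illustrative and are not taken from the GROMACS source.

```python
import math

def ewald_beta(rc, rtol, iterations=60):
    """Solve erfc(beta * rc) = rtol for beta by bisection (illustrative helper)."""
    lo, hi = 0.0, 1.0
    # Grow the upper bound until erfc(hi * rc) falls to or below the tolerance.
    while math.erfc(hi * rc) > rtol:
        hi *= 2.0
    # Narrow the bracket [lo, hi] around the root.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if math.erfc(mid * rc) > rtol:
            lo = mid  # beta too small: erfc is still above the tolerance
        else:
            hi = mid
    return 0.5 * (lo + hi)

# rcoulomb = 1 nm and ewald_rtol = 1e-05 are taken from the run parameters.
beta = ewald_beta(rc=1.0, rtol=1e-5)
width = 1.0 / beta
print(f"1/beta = {width:.4f} nm")
```

If the assumption holds, the printed width should be close to the 0.320163 nm in the log.
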
> 
> Enabling SPC water optimization for 11938 molecules.
> 
> Will do PME sum in reciprocal space.
> 
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> U. Essman, L. Perela, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
> A smooth particle mesh Ewald method
> J. Chem. Phys. 103 (1995) pp. 8577-8592
> -------- -------- --- Thank You --- -------- --------
> 
> Parallelized PME sum used.
> PARALLEL FFT DATA:
>   local_nx:                  13  local_x_start:                   0
>   local_ny_after_transpose:  15  local_y_start_after_transpose    0
> Removing pbc first time
> Done rmpbc
> Center of mass motion removal mode is Angular
> We have the following groups for center of mass motion removal:
>  0:  rest, initial mass: 302278
> There are: 11952 Atoms
> 
> Constraining the starting coordinates (step -2)
> 
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, J. P. M. Postma, A. DiNola and J. R. Haak
> Molecular dynamics with coupling to an external bath
> J. Chem. Phys. 81 (1984) pp. 3684-3690
> -------- -------- --- Thank You --- -------- --------
> 
> 
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> S. Miyamoto and P. A. Kollman
> SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
> Water Models
> J. Comp. Chem. 13 (1992) pp. 952-962
> -------- -------- --- Thank You --- -------- --------
> 
> 
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
> LINCS: A Linear Constraint Solver for molecular simulations
> J. Comp. Chem. 18 (1997) pp. 1463-1472
> -------- -------- --- Thank You --- -------- --------
> 
> 
> Initializing LINear Constraint Solver
>  number of constraints is 5967
>  average number of constraints coupled to one constraint is 0.9
> 
>   Rel. Constraint Deviation:  Max    between atoms     RMS
>       Before LINCS         0.012234   6702   6703   0.003802
>        After LINCS         0.000005   7804   7806   0.000001
> 
> Going to use C-settle (4 waters)
> wo = 0.888099, wh =0.0559503, wohh = 18.016, rc = 0.075695, ra = 0.00655606
> rb = 0.0520322, rc2 = 0.15139, rone = 1, dHH = 0.15139, dOH = 0.09572
> 
> Constraining the coordinates at t0-dt (step -1)
>   Rel. Constraint Deviation:  Max    between atoms     RMS
>       Before LINCS         0.001303   7535   7537   0.000163
>        After LINCS         0.000026  10529  10532   0.000003
> 
> Started mdrun on node 0 Sat Oct 25 17:35:08 2008
> Initial temperature: 296.531 K
>           Step           Time         Lambda
>              0        0.00000        0.00000
> 
> Grid: 12 x 14 x 21 cells
> Configuring nonbonded kernels...
> Testing AMD 3DNow support... not present.
> Testing ia32 SSE support... present.
> 
> 
>   Rel. Constraint Deviation:  Max    between atoms     RMS
>       Before LINCS         0.061793   7445   7446   0.007589
>        After LINCS         0.000032  10529  10532   0.000003
> 
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    9.42960e+03    2.53513e+04    1.38929e+03    2.94437e+04    1.19233e+04
>     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential
>    1.38451e+05    4.34924e+04   -5.72876e+05   -2.39230e+05   -5.52626e+05
>    Kinetic En.   Total Energy    Temperature Pressure (bar)
>    1.25294e+05   -4.27332e+05    2.96497e+02   -1.70726e+02
> 
>           Step           Time         Lambda
>           1000        2.00000        0.00000
> 
>   Rel. Constraint Deviation:  Max    between atoms     RMS
>       Before LINCS         0.072407   7445   7446   0.007658
>        After LINCS         0.000032   3550   3551   0.000003
> 
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    9.31967e+03    2.53570e+04    1.35254e+03    2.94476e+04    1.20265e+04
>     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential
>    1.38690e+05    4.33753e+04   -5.74055e+05   -2.39420e+05   -5.53906e+05
>    Kinetic En.   Total Energy    Temperature Pressure (bar)
>    1.26112e+05   -4.27794e+05    2.98434e+02    5.69840e+01
> 
> 
> Total NODE time on node 0: 242.99
> Average NODE time: 60.7475
> Load imbalance reduced performance to 400% of max
>       <======  ###############  ==>
>       <====  A V E R A G E S  ====>
>       <==  ###############  ======>
> 
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    9.28039e+03    2.53419e+04    1.35613e+03    2.93246e+04    1.19887e+04
>     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential
>    1.38392e+05    4.33318e+04   -5.73108e+05   -2.39245e+05   -5.53338e+05
>    Kinetic En.   Total Energy    Temperature Pressure (bar)
>    1.25976e+05   -4.27362e+05    2.98111e+02   -2.69801e+00
> 
>          Box-X          Box-Y          Box-Z         Volume   Density (SI)
>    6.09850e+00    7.19869e+00    1.08997e+01    4.78511e+02    1.04899e+03
>             pV
>   -7.77757e+01
> 
>   Total Virial (kJ/mol)
>    4.16629e+04    3.74690e+02   -3.38657e+02
>    3.74799e+02    4.14944e+04    1.74394e+02
>   -3.41842e+02    1.75904e+02    4.29355e+04
> 
>   Pressure (bar)
>    2.41223e+01   -2.06918e+01    2.23237e+01
>   -2.06993e+01    4.74897e+01   -6.10837e+00
>    2.25447e+01   -6.21320e+00   -7.97060e+01
> 
>   Total Dipole (Debye)
>   -6.55277e+02   -8.61214e+02    4.17082e+02
> 
>       <======  ###############################  ==>
>       <====  R M S - F L U C T U A T I O N S  ====>
>       <==  ###############################  ======>
> 
>   Energies (kJ/mol)
>           Bond          Angle    Proper Dih. Ryckaert-Bell.          LJ-14
>    1.55015e+02    2.03651e+02    4.20929e+01    1.22827e+02    7.61767e+01
>     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential
>    1.81161e+02    5.82316e+02    7.17480e+02    9.44462e+01    4.40698e+02
>    Kinetic En.   Total Energy    Temperature Pressure (bar)
>    3.93428e+02    1.45874e+02    9.31012e-01    1.29635e+02
> 
>          Box-X          Box-Y          Box-Z         Volume   Density (SI)
>    1.23222e-03    1.45343e-03    2.20161e-03    2.89950e-01    6.35630e-01
>             pV
>    3.73587e+03
> 
>   Total Virial (kJ/mol)
>    2.75764e+03    1.92974e+03    2.11050e+03
>    1.93131e+03    2.94483e+03    2.05791e+03
>    2.11304e+03    2.05311e+03    3.31016e+03
> 
>   Pressure (bar)
>    1.90896e+02    1.36925e+02    1.48015e+02
>    1.37042e+02    2.04299e+02    1.43707e+02
>    1.48169e+02    1.43367e+02    2.31551e+02
> 
>   Total Dipole (Debye)
>    9.68668e+01    1.43086e+02    3.27665e+02
> 
> 
>       M E G A - F L O P S   A C C O U N T I N G
> 
>       Parallel run - timing based on wallclock.
>   RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>   T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>   NF=No Forces
> 
> Computing:                        M-Number         M-Flops  % of Flops
> -----------------------------------------------------------------------
> Coul(T)                         646.149375    27138.273750     4.2
> Coul(T) [W3]                      2.543824      317.978000     0.0
> Coul(T) + LJ                   2021.495476   111182.251180    17.3
> Coul(T) + LJ [W3]               219.255641    30257.278458     4.7
> Coul(T) + LJ [W3-W3]            730.391892   279009.702744    43.4
> Outer nonbonded loop            173.047881     1730.478810     0.3
> 1,4 nonbonded interactions       31.503472     2835.312480     0.4
> Spread Q Bspline               3063.092032     6126.184064     1.0
> Gather F Bspline               3063.092032    36757.104384     5.7
> 3D-FFT                        10296.774488    82374.195904    12.8
> Solve PME                       568.407840    36378.101760     5.7
> NS-Pairs                        394.401556     8282.432676     1.3
> Reset In Box                      2.438463       21.946167     0.0
> Shift-X                          95.603508      573.621048     0.1
> CG-CoM                            1.220685       35.399865     0.0
> Sum Forces                      143.582439      143.582439     0.0
> Bonds                             6.193187      266.307041     0.0
> Angles                           22.014993     3588.443859     0.6
> Propers                           2.389387      547.169623     0.1
> RB-Dihedrals                     25.411386     6276.612342     1.0
> Virial                           47.968921      863.440578     0.1
> Update                           47.860813     1483.685203     0.2
> Stop-CM                           0.047813        0.478130     0.0
> P-Coupling                       47.860813      287.164878     0.0
> Calc-Ekin                        47.908626     1293.532902     0.2
> Lincs                             5.984901      359.094060     0.1
> Lincs-Mat                        31.955580      127.822320     0.0
> Constraint-V                     47.860813      287.164878     0.0
> Constraint-Vir                   41.906343     1005.752232     0.2
> Settle                           11.973814     3867.541922     0.6
> -----------------------------------------------------------------------
> Total                                        643418.053697   100.0
> -----------------------------------------------------------------------
> 
>               NODE (s)   Real (s)      (%)
>       Time:    281.000    281.000    100.0
>                       4:41
>               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:     12.882      2.290      0.615     39.028
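
The performance figures above can be cross-checked from numbers already in the log: 1000 steps of 0.002 ps took 281 s of wall-clock time. The sketch below redoes that arithmetic in Python; the variable names are my own, and the formulas are the obvious unit conversions rather than anything copied from mdrun.

```python
nsteps = 1000                  # nsteps from the run parameters
delta_t_ps = 0.002             # delta_t in ps
wall_s = 281.0                 # "Real (s)" from the timing table
total_mflops = 643418.053697   # "Total" M-Flops from the accounting table

simulated_ns = nsteps * delta_t_ps / 1000.0      # ps -> ns
ns_per_day = simulated_ns * 86400.0 / wall_s     # simulated ns per wall-clock day
hours_per_ns = 24.0 / ns_per_day                 # wall-clock hours per simulated ns
gflops = total_mflops / wall_s / 1000.0          # MFlops/s -> GFlops

print(f"{ns_per_day:.3f} ns/day, {hours_per_ns:.3f} hour/ns, {gflops:.3f} GFlops")
```

These values agree with the (GFlops), (ns/day) and (hour/ns) columns of the Performance line.
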
> 
> Detailed load balancing info in percentage of average
> Type        NODE:  0   1   2   3 Scaling
> ---------------------------------------
>        Coul(T):399   0   0   0     25%
>   Coul(T) [W3]:  0  36 106 256     38%
>   Coul(T) + LJ:399   0   0   0     25%
> Coul(T) + LJ [W3]:  0  43 126 229     43%
> Coul(T) + LJ [W3-W3]:  0 218 135  46     45%
> Outer nonbonded loop:172  71  77  78     57%
> 1,4 nonbonded interactions:400   0   0   0     25%
> Spread Q Bspline: 99 101 101  97     98%
> Gather F Bspline: 99 101 101  97     98%
>         3D-FFT:100 100 100 100    100%
>      Solve PME:100 100 100 100    100%
>       NS-Pairs:266  56  43  32     37%
>   Reset In Box: 99 100  99 100     99%
>        Shift-X:100 100 100  99     99%
>         CG-CoM:199  66  66  67     50%
>     Sum Forces: 99 100  99 100     99%
>          Bonds:400   0   0   0     25%
>         Angles:400   0   0   0     25%
>        Propers:400   0   0   0     25%
>   RB-Dihedrals:400   0   0   0     25%
>         Virial: 99 100  99 100     99%
>         Update: 99 100  99 100     99%
>        Stop-CM: 99 100  99 100     99%
>     P-Coupling: 99 100  99 100     99%
>      Calc-Ekin: 99 100  99 100     99%
>          Lincs:400   0   0   0     25%
>      Lincs-Mat:400   0   0   0     25%
>   Constraint-V: 99 100  99 100     99%
> Constraint-Vir: 57 114 114 113     87%
>         Settle:  0 133 133 132     74%
> 
>    Total Force:132 133  89  43     74%
> 
> 
>    Total Shake: 38 120 120 120     82%
> 
> 
> Total Scaling: 75% of max performance
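
The "Scaling" column in the table above looks like the average load divided by the load on the busiest node. That reading is an inference from the numbers, not something the log states. A small Python sketch under that assumption, using a few rows copied from the table (scaling_percent is a hypothetical helper):

```python
# Per-node load in percent of average, and the Scaling value printed by mdrun.
rows = {
    "Coul(T)":     ([399, 0, 0, 0], 25),
    "NS-Pairs":    ([266, 56, 43, 32], 37),
    "Settle":      ([0, 133, 133, 132], 74),
    "Total Force": ([132, 133, 89, 43], 74),
}

def scaling_percent(loads):
    """Average load over maximum load, in percent (assumed definition)."""
    average = sum(loads) / len(loads)
    return 100.0 * average / max(loads)

estimates = {name: scaling_percent(loads) for name, (loads, _) in rows.items()}
for name, (loads, reported) in rows.items():
    print(f"{name:12s} estimated {estimates[name]:5.1f}%  reported {reported}%")
```

Because the per-node figures in the log are already rounded, the estimates can only be expected to match the printed column to within a percentage point or two. On this reading, the rows with 399 or 400 on node 0 and 0 elsewhere (bonded terms, LINCS) are work done entirely on one node, which caps their scaling at 25% on 4 CPUs.
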
> 
> Finished mdrun on node 0 Sat Oct 25 17:39:49 2008
> 
> 


-- 
David van der Spoel, Ph.D., Professor of Biology
Molec. Biophys. group, Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone:	+46184714205. Fax: +4618511755.
spoel at xray.bmc.uu.se	spoel at gromacs.org   http://folding.bmc.uu.se
