[gmx-users] gromacs-4.0.5 parallel run in 8 cpu: slow speed
Mark Abraham
mark.abraham at anu.edu.au
Thu Jun 11 15:56:15 CEST 2009
On 06/11/09, Thamu <asthamu at gmail.com> wrote:
>
> Hi Mark,
>
> The top md.log is below. The mdrun command was "mpirun -np 8 ~/software/bin/mdrun_mpi -deffnm md"
In my experience, a correctly-configured MPI GROMACS running in parallel reports, near the top of the .log file, the number of nodes it is using and the identity of the node writing that file. That information is missing here, so something is wrong with your setup.
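If mdrun_mpi was not actually built with MPI support, "mpirun -np 8" simply launches eight independent serial copies that all write to the same files, which would explain 95-100% CPU on every core with no speedup. A rough sketch of how you might check, and if necessary rebuild with the GROMACS 4.0.x autoconf build (the install prefix and the _mpi suffix below are only examples, not your actual paths):

    # Check whether the binary links against an MPI library at all
    # (assumes a dynamically-linked build; adjust the path as needed).
    ldd ~/software/bin/mdrun_mpi | grep -i mpi

    # If nothing MPI-related appears, reconfigure and rebuild mdrun only:
    cd gromacs-4.0.5
    ./configure --prefix=$HOME/software --enable-mpi --program-suffix=_mpi
    make mdrun
    make install-mdrun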
I've assumed that you've compared this "8-processor" runtime with a single-processor runtime and found them comparable...
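If not, a quick way to get a fair comparison is to run the same short benchmark .tpr once with the serial mdrun and once under MPI, then compare the ns/day figures in the timing summary at the end of each .log. A minimal sketch, assuming typical GROMACS 4.0.x tool names; bench.mdp (with a small nsteps), conf.gro and topol.top are placeholders for your own input files:

    # Build a short benchmark run (set nsteps to a few thousand in bench.mdp)
    grompp -f bench.mdp -c conf.gro -p topol.top -o bench.tpr

    # Same .tpr, serial vs. 8 MPI processes
    mdrun -s bench.tpr -deffnm bench_serial
    mpirun -np 8 ~/software/bin/mdrun_mpi -s bench.tpr -deffnm bench_mpi

    # Compare the performance summaries (ns/day) at the end of the logs
    grep "Performance:" bench_serial.log bench_mpi.log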
Mark
>
>
>
> :-) G R O M A C S (-:
>
> GROup of MAchos and Cynical Suckers
>
> :-) VERSION 4.0.5 (-:
>
>
> Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2008, The GROMACS development team,
> check out http://www.gromacs.org for more information.
>
> This program is free software; you can redistribute it and/or
> modify it under the terms of the GNU General Public License
> as published by the Free Software Foundation; either version 2
> of the License, or (at your option) any later version.
>
> :-) /home/thamu/software/bin/mdrun_mpi (-:
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
> GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
> molecular simulation
> J. Chem. Theory Comput. 4 (2008) pp. 435-447
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
> Berendsen
> GROMACS: Fast, Flexible and Free
> J. Comp. Chem. 26 (2005) pp. 1701-1719
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> E. Lindahl and B. Hess and D. van der Spoel
> GROMACS 3.0: A package for molecular simulation and trajectory analysis
> J. Mol. Mod. 7 (2001) pp. 306-317
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, D. van der Spoel and R. van Drunen
> GROMACS: A message-passing parallel molecular dynamics implementation
> Comp. Phys. Comm. 91 (1995) pp. 43-56
> -------- -------- --- Thank You --- -------- --------
>
> Input Parameters:
> integrator = md
> nsteps = 10000000
> init_step = 0
> ns_type = Grid
> nstlist = 10
> ndelta = 2
> nstcomm = 1
> comm_mode = Linear
> nstlog = 100
> nstxout = 1000
> nstvout = 0
> nstfout = 0
> nstenergy = 100
> nstxtcout = 0
> init_t = 0
> delta_t = 0.002
> xtcprec = 1000
> nkx = 70
> nky = 70
> nkz = 70
> pme_order = 4
> ewald_rtol = 1e-05
> ewald_geometry = 0
> epsilon_surface = 0
> optimize_fft = TRUE
> ePBC = xyz
> bPeriodicMols = FALSE
> bContinuation = FALSE
> bShakeSOR = FALSE
> etc = V-rescale
> epc = Parrinello-Rahman
> epctype = Isotropic
> tau_p = 0.5
> ref_p (3x3):
> ref_p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}
> ref_p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}
> ref_p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}
> compress (3x3):
> compress[ 0]={ 4.50000e-05, 0.00000e+00, 0.00000e+00}
> compress[ 1]={ 0.00000e+00, 4.50000e-05, 0.00000e+00}
> compress[ 2]={ 0.00000e+00, 0.00000e+00, 4.50000e-05}
> refcoord_scaling = No
> posres_com (3):
> posres_com[0]= 0.00000e+00
> posres_com[1]= 0.00000e+00
> posres_com[2]= 0.00000e+00
> posres_comB (3):
> posres_comB[0]= 0.00000e+00
> posres_comB[1]= 0.00000e+00
> posres_comB[2]= 0.00000e+00
> andersen_seed = 815131
> rlist = 1
> rtpi = 0.05
> coulombtype = PME
> rcoulomb_switch = 0
> rcoulomb = 1
> vdwtype = Cut-off
> rvdw_switch = 0
> rvdw = 1.4
> epsilon_r = 1
> epsilon_rf = 1
> tabext = 1
> implicit_solvent = No
> gb_algorithm = Still
> gb_epsilon_solvent = 80
> nstgbradii = 1
> rgbradii = 2
> gb_saltconc = 0
> gb_obc_alpha = 1
> gb_obc_beta = 0.8
> gb_obc_gamma = 4.85
> sa_surface_tension = 2.092
> DispCorr = No
> free_energy = no
> init_lambda = 0
> sc_alpha = 0
> sc_power = 0
> sc_sigma = 0.3
> delta_lambda = 0
> nwall = 0
> wall_type = 9-3
> wall_atomtype[0] = -1
> wall_atomtype[1] = -1
> wall_density[0] = 0
> wall_density[1] = 0
> wall_ewald_zfac = 3
> pull = no
> disre = No
> disre_weighting = Conservative
> disre_mixed = FALSE
> dr_fc = 1000
> dr_tau = 0
> nstdisreout = 100
> orires_fc = 0
> orires_tau = 0
> nstorireout = 100
> dihre-fc = 1000
> em_stepsize = 0.01
> em_tol = 10
> niter = 20
> fc_stepsize = 0
> nstcgsteep = 1000
> nbfgscorr = 10
> ConstAlg = Lincs
> shake_tol = 0.0001
> lincs_order = 4
> lincs_warnangle = 30
> lincs_iter = 1
> bd_fric = 0
> ld_seed = 1993
> cos_accel = 0
> deform (3x3):
> deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
> deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
> deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
> userint1 = 0
> userint2 = 0
> userint3 = 0
> userint4 = 0
> userreal1 = 0
> userreal2 = 0
> userreal3 = 0
> userreal4 = 0
> grpopts:
> nrdf: 6706.82 106800
> ref_t: 300 300
> tau_t: 0.1 0.1
> anneal: No No
> ann_npoints: 0 0
> acc: 0 0 0
> nfreeze: N N N
> energygrp_flags[ 0]: 0 0 0
> energygrp_flags[ 1]: 0 0 0
> energygrp_flags[ 2]: 0 0 0
> efield-x:
> n = 0
> efield-xt:
> n = 0
> efield-y:
> n = 0
> efield-yt:
> n = 0
> efield-z:
> n = 0
> efield-zt:
> n = 0
> bQMMM = FALSE
> QMconstraints = 0
> QMMMscheme = 0
> scalefactor = 1
> qm_opts:
> ngQM = 0
> Table routines are used for coulomb: TRUE
> Table routines are used for vdw: FALSE
> Will do PME sum in reciprocal space.
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> U. Essman, L. Perela, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
> A smooth particle mesh Ewald method
> J. Chem. Phys. 103 (1995) pp. 8577-8592
> -------- -------- --- Thank You --- -------- --------
>
> Using a Gaussian width (1/beta) of 0.320163 nm for Ewald
> Cut-off's: NS: 1 Coulomb: 1 LJ: 1.4
> System total charge: -0.000
> Generated table with 1200 data points for Ewald.
> Tabscale = 500 points/nm
> Generated table with 1200 data points for LJ6.
> Tabscale = 500 points/nm
> Generated table with 1200 data points for LJ12.
> Tabscale = 500 points/nm
> Generated table with 1200 data points for 1-4 COUL.
> Tabscale = 500 points/nm
> Generated table with 1200 data points for 1-4 LJ6.
> Tabscale = 500 points/nm
> Generated table with 1200 data points for 1-4 LJ12.
> Tabscale = 500 points/nm
>
> Enabling TIP4p water optimization for 17798 molecules.
>
> Configuring nonbonded kernels...
> Testing x86_64 SSE support... present.
>
>
> Removing pbc first time
>
> Initializing LINear Constraint Solver
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and H. Bekker and H. J. C. Berendsen and J. G. E. M. Fraaije
> LINCS: A Linear Constraint Solver for molecular simulations
> J. Comp. Chem. 18 (1997) pp. 1463-1472
> -------- -------- --- Thank You --- -------- --------
>
> The number of constraints is 3439
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> S. Miyamoto and P. A. Kollman
> SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
> Water Models
> J. Comp. Chem. 13 (1992) pp. 952-962
> -------- -------- --- Thank You --- -------- --------
>
> Center of mass motion removal mode is Linear
> We have the following groups for center of mass motion removal:
> 0: rest
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> G. Bussi, D. Donadio and M. Parrinello
> Canonical sampling through velocity rescaling
> J. Chem. Phys. 126 (2007) pp. 014101
> -------- -------- --- Thank You --- -------- --------
>
> There are: 56781 Atoms
> There are: 17798 VSites
> Max number of connections per atom is 59
> Total number of connections is 216528
> Max number of graph edges per atom is 4
> Total number of graph edges is 113666
>
> Constraining the starting coordinates (step 0)
>
> Constraining the coordinates at t0-dt (step 0)
> RMS relative constraint deviation after constraining: 3.77e-05
> Initial temperature: 299.838 K
>
>
> > > Recently I successfully installed the gromacs-4.0.5 mpi version.
> > > I could run in 8 cpu. but the speed is very slow.
> > > Total number of atoms in the system is 78424.
> > > while running all 8 cpu showing 95-100% CPU.
> > >
> > > How to speed up the calculation.
> > >
> > > Thanks
> > >
> > >
> > That's normal for a system with that atoms/cpu ratio.
> > What's your system and what mdp file are you using?
> > --
> > ------------------------------------------------------
> > You haven't given us any diagnostic information. The problem could be
> > that you're not running an MPI GROMACS (show us your configure line,
> > your mdrun command line and the top 50 lines of your .log file).
> >
> > Mark
> >
> >
> >
>
>