[gmx-users] g_tune_pme_mpi is not compatible with mdrun_mpi

Mark Abraham mark.j.abraham at gmail.com
Mon Sep 29 18:17:33 CEST 2014


Hi,

It can't be fixed, because there is no surefire way to run an arbitrary
.tpr on an arbitrary number of ranks, no matter how cleverly -npme is
guessed. We should just make the check optional instead of a deal
breaker.
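
A workaround in the meantime is to pick -npme yourself so that the
remaining PP ranks factor into small primes, or to fix the DD grid
explicitly. An untested sketch, using the rank counts from this thread:

mpirun -np 48 mdrun_mpi -npme 12 -s ../eq_nvt/1ZG4_nvt.tpr
# 48 - 12 = 36 PP ranks = 2*2*3*3, so a decomposition exists

mpirun -np 48 mdrun_mpi -npme 12 -dd 4 3 3 -s ../eq_nvt/1ZG4_nvt.tpr
# as above, but with the 4x3x3 PP grid set explicitly instead of auto-chosen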

Mark
On Sep 29, 2014 4:35 PM, "Carsten Kutzner" <ckutzne at gwdg.de> wrote:

> Hi,
>
> I see where the problem is.
> There is an initial check in g_tune_pme to make sure that parallel
> runs can be executed at all. This check is run with the automatic
> number of PME-only ranks, which is 11 for your input file.
> Unfortunately, that leaves 37 PP ranks, for which no domain
> decomposition can be found.
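>
> A quick way to see this in a shell (a sketch, assuming GNU coreutils'
> factor is available):
>
> factor $((48 - 11))   # "37: 37"      <- prime, no DD grid possible
> factor $((48 - 12))   # "36: 2 2 3 3" <- small factors, DD works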
>
> We discussed some time ago that this could happen and that it
> should be fixed. I will open a bug entry.
>
> Thanks,
>   Carsten
>
>
> On 29 Sep 2014, at 15:36, Ebert Maximilian <m.ebert at umontreal.ca> wrote:
>
> > Hi,
> >
> > this is the command:
> >
> > setenv MDRUN mdrun_mpi
> >
> > g_tune_pme_mpi -np 48 -s ../eq_nvt/1ZG4_nvt.tpr -launch
> >
> >
> > Here is the output of perf.out:
> >
> > ------------------------------------------------------------
> >
> >      P E R F O R M A N C E   R E S U L T S
> >
> > ------------------------------------------------------------
> > g_tune_pme_mpi for Gromacs VERSION 5.0.1
> > Number of ranks         : 48
> > The mpirun command is   : mpirun
> > Passing # of ranks via  : -np
> > The mdrun  command is   : mdrun_mpi
> > mdrun args benchmarks   : -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log
> > Benchmark steps         : 1000
> > dlb equilibration steps : 100
> > mdrun args at launchtime:
> > Repeats for each test   : 2
> > Input file              : ../eq_nvt/1ZG4_nvt.tpr
> >   PME/PP load estimate : 0.151964
> >   Number of particles  : 39489
> >   Coulomb type         : PME
> >   Grid spacing x y z   : 0.114561 0.114561 0.114561
> >   Van der Waals type   : Cut-off
> >
> > Will try these real/reciprocal workload settings:
> > No.   scaling  rcoulomb  nkx  nky  nkz   spacing      rvdw  tpr file
> >   0  1.000000  1.200000   72   72   72  0.120000   1.200000  ../eq_nvt/1ZG4_nvt_bench00.tpr
> >   1  1.100000  1.320000   64   64   64  0.132000   1.320000  ../eq_nvt/1ZG4_nvt_bench01.tpr
> >   2  1.200000  1.440000   60   60   60  0.144000   1.440000  ../eq_nvt/1ZG4_nvt_bench02.tpr
> >
> > Note that in addition to the Coulomb radius and the Fourier grid
> > other input settings were also changed (see table above).
> > Please check if the modified settings are appropriate.
> >
> > Individual timings for input file 0 (../eq_nvt/1ZG4_nvt_bench00.tpr):
> > PME ranks      Gcycles       ns/day        PME/f    Remark
> >
> > ------------------------------------------------------------
> > Cannot run the benchmark simulations! Please check the error message of
> > mdrun for the source of the problem. Did you provide a command line
> > argument that neither g_tune_pme nor mdrun understands? Offending command:
> >
> > mpirun -np 48 mdrun_mpi -npme 11 -s ../eq_nvt/1ZG4_nvt_bench00.tpr -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log -nsteps 1 -quiet
> >
> >
> >
> > and here are parts of the bench.log:
> >
> > Log file opened on Mon Sep 29 08:56:38 2014
> > Host: node-e1-67  pid: 24470  rank ID: 0  number of ranks:  48
> > GROMACS:    gmx mdrun, VERSION 5.0.1
> >
> > GROMACS is written by:
> > Emile Apol         Rossen Apostolov   Herman J.C. Berendsen Par Bjelkmar
> > Aldert van Buuren  Rudi van Drunen    Anton Feenstra     Sebastian Fritsch
> > Gerrit Groenhof    Christoph Junghans Peter Kasson       Carsten Kutzner
> > Per Larsson        Justin A. Lemkul   Magnus Lundborg    Pieter Meulenhoff
> > Erik Marklund      Teemu Murtola      Szilard Pall       Sander Pronk
> > Roland Schulz      Alexey Shvetsov    Michael Shirts     Alfons Sijbers
> > Peter Tieleman     Christian Wennberg Maarten Wolf
> > and the project leaders:
> > Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
> >
> > Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> > Copyright (c) 2001-2014, The GROMACS development team at
> > Uppsala University, Stockholm University and
> > the Royal Institute of Technology, Sweden.
> > check out http://www.gromacs.org for more information.
> >
> > GROMACS is free software; you can redistribute it and/or modify it
> > under the terms of the GNU Lesser General Public License
> > as published by the Free Software Foundation; either version 2.1
> > of the License, or (at your option) any later version.
> >
> > GROMACS:      gmx mdrun, VERSION 5.0.1
> > Executable:   /home/apps/Logiciels/gromacs/gromacs-5.0.1/bin/gmx_mpi
> > Library dir:  /home/apps/Logiciels/gromacs/gromacs-5.0.1/share/gromacs/top
> > Command line:
> >  mdrun_mpi -npme 11 -s ../eq_nvt/1ZG4_nvt_bench00.tpr -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log -nsteps 1 -quiet
> >
> > Gromacs version:    VERSION 5.0.1
> > Precision:          single
> > Memory model:       64 bit
> > MPI library:        MPI
> > OpenMP support:     enabled
> > GPU support:        disabled
> > invsqrt routine:    gmx_software_invsqrt(x)
> > SIMD instructions:  SSE4.1
> > FFT library:        fftw-3.3.3-sse2
> > RDTSCP usage:       enabled
> > C++11 compilation:  enabled
> > TNG support:        enabled
> > Tracing support:    disabled
> > Built on:           Tue Sep 23 09:58:07 EDT 2014
> > Built by:           rqchpbib at briaree1 [CMAKE]
> > Build OS/arch:      Linux 2.6.32-71.el6.x86_64 x86_64
> > Build CPU vendor:   GenuineIntel
> > Build CPU brand:    Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
> > Build CPU family:   6   Model: 44   Stepping: 2
> > Build CPU features: aes apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
> > C compiler:         /RQusagers/apps/Logiciels/gcc/4.8.1/bin/gcc GNU 4.8.1
> > C compiler flags:   -msse4.1 -Wno-maybe-uninitialized -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds -O3 -DNDEBUG
> > C++ compiler:       /RQusagers/apps/Logiciels/gcc/4.8.1/bin/g++ GNU 4.8.1
> > C++ compiler flags: -msse4.1 -std=c++0x -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds -O3 -DNDEBUG
> > Boost version:      1.55.0 (internal)
> >
> >
> > ....
> >
> >
> >      n = 0
> >   E-zt:
> >      n = 0
> >   swapcoords                     = no
> >   adress                         = FALSE
> >   userint1                       = 0
> >   userint2                       = 0
> >   userint3                       = 0
> >   userint4                       = 0
> >   userreal1                      = 0
> >   userreal2                      = 0
> >   userreal3                      = 0
> >   userreal4                      = 0
> > grpopts:
> >   nrdf:     10175.6     70836.4
> >   ref-t:      304.65      304.65
> >   tau-t:         0.5         0.5
> > annealing:      Single      Single
> > annealing-npoints:           4           4
> > annealing-time [0]:              0.0       200.0       300.0       750.0
> > annealing-temp [0]:             10.0       100.0       100.0       304.6
> > annealing-time [1]:              0.0       200.0       300.0       750.0
> > annealing-temp [1]:             10.0       100.0       100.0       304.6
> >   acc:            0           0           0
> >   nfreeze:           N           N           N
> >   energygrp-flags[  0]: 0
> >
> > Overriding nsteps with value passed on the command line: 1 steps, 0.002 ps
> >
> >
> > Initializing Domain Decomposition on 48 ranks
> > Dynamic load balancing: auto
> > Will sort the charge groups at every domain (re)decomposition
> > Initial maximum inter charge-group distances:
> >    two-body bonded interactions: 0.422 nm, LJ-14, atoms 1444 1452
> >  multi-body bonded interactions: 0.422 nm, Proper Dih., atoms 1444 1452
> > Minimum cell size due to bonded interactions: 0.464 nm
> > Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
> > Estimated maximum distance required for P-LINCS: 0.218 nm
> >
> > -------------------------------------------------------
> > Program mdrun_mpi, VERSION 5.0.1
> > Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/mdlib/domdec_setup.c, line: 728
> >
> > Fatal error:
> > The number of ranks you selected (37) contains a large prime factor 37.
> > In most cases this will lead to bad performance. Choose a number with
> > smaller prime factors or set the decomposition (option -dd) manually.
> > For more information and tips for troubleshooting, please check the
> > GROMACS website at http://www.gromacs.org/Documentation/Errors
> > -------------------------------------------------------
> > -----Original Message-----
> > From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se
> > [mailto:gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf Of
> > Carsten Kutzner
> > Sent: Monday, 29 September 2014 15:23
> > To: gmx-users at gromacs.org
> > Subject: Re: [gmx-users] g_tune_pme_mpi is not compatible with mdrun_mpi
> >
> > Hi,
> >
> > is this the only output?
> >
> > Don't you get a perf.out file that lists which settings are optimal?
> >
> > What exactly was the command line you used?
> >
> > Carsten
> >
> >
> > On 29 Sep 2014, at 15:01, Ebert Maximilian <m.ebert at umontreal.ca> wrote:
> >
> >> Hi,
> >>
> >> I just tried that and I got the following error message (bench.log).
> >> Any idea what could be wrong?
> >>
> >> Thank you very much,
> >>
> >> Max
> >>
> >> Initializing Domain Decomposition on 48 ranks
> >> Dynamic load balancing: auto
> >> Will sort the charge groups at every domain (re)decomposition
> >> Initial maximum inter charge-group distances:
> >>    two-body bonded interactions: 0.422 nm, LJ-14, atoms 1444 1452
> >>  multi-body bonded interactions: 0.422 nm, Proper Dih., atoms 1444 1452
> >> Minimum cell size due to bonded interactions: 0.464 nm
> >> Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
> >> Estimated maximum distance required for P-LINCS: 0.218 nm
> >>
> >> -------------------------------------------------------
> >> Program mdrun_mpi, VERSION 5.0.1
> >> Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/mdlib/domdec_setup.c, line: 728
> >>
> >> Fatal error:
> >> The number of ranks you selected (37) contains a large prime factor 37.
> >> In most cases this will lead to bad performance. Choose a number with
> >> smaller prime factors or set the decomposition (option -dd) manually.
> >> For more information and tips for troubleshooting, please check the
> >> GROMACS website at http://www.gromacs.org/Documentation/Errors
> >> -------------------------------------------------------
> >>
> >> -----Original Message-----
> >> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se
> >> [mailto:gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf
> >> Of Carsten Kutzner
> >> Sent: Thursday, 25 September 2014 19:29
> >> To: gmx-users at gromacs.org
> >> Subject: Re: [gmx-users] g_tune_pme_mpi is not compatible with mdrun_mpi
> >>
> >> Hi,
> >>
> >> Don't invoke g_tune_pme with 'mpirun', because it is a serial
> >> executable that itself invokes parallel MD runs for testing.
> >>
> >> use
> >> export MDRUN=mdrun_mpi
> >>
> >> g_tune_pme -np 24 -s 1ZG4_nvt.tpr -launch
> >>
> >> see also
> >>
> >> g_tune_pme -h
> >>
> >> You may need to recompile g_tune_pme without MPI enabled (depending
> >> on your queueing system).
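> >>
> >> A minimal sketch of the environment setup (assuming bash, and assuming
> >> your g_tune_pme build also honors the MPIRUN variable for the launcher;
> >> check g_tune_pme -h for your version):
> >>
> >> export MDRUN=mdrun_mpi   # parallel mdrun that g_tune_pme will launch
> >> export MPIRUN=mpirun     # launcher prepended to each benchmark run
> >> g_tune_pme -np 24 -s 1ZG4_nvt.tpr -launch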
> >>
> >> Best,
> >> Carsten
> >>
> >>
> >> On 25 Sep 2014, at 15:10, Ebert Maximilian <m.ebert at umontreal.ca> wrote:
> >>
> >>> Dear list,
> >>>
> >>> I tried using g_tune_pme_mpi with the command:
> >>>
> >>> mpirun -np 24 g_tune_pme_mpi -np 24 -s 1ZG4_nvt.tpr -launch
> >>>
> >>> on GROMACS 5.0.1 but I get the following error message:
> >>> --------------------------------------------------------------------------
> >>> mpirun was unable to launch the specified application as it could
> >>> not find an executable:
> >>>
> >>> Executable: mdrun
> >>> Node: xxxx
> >>>
> >>> while attempting to start process rank 0.
> >>> --------------------------------------------------------------------------
> >>> 24 total processes failed to start
> >>>
> >>>
> >>> Any idea why this is? Shouldn't g_tune_pme_mpi call mdrun_mpi instead?
> >>>
> >>> Thank you very much,
> >>>
> >>> Max
>
>
> --
> Dr. Carsten Kutzner
> Max Planck Institute for Biophysical Chemistry
> Theoretical and Computational Biophysics
> Am Fassberg 11, 37077 Goettingen, Germany
> Tel. +49-551-2012313, Fax: +49-551-2012302
> http://www.mpibpc.mpg.de/grubmueller/kutzner
> http://www.mpibpc.mpg.de/grubmueller/sppexa
>

