[gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
    Carsten Kutzner 
    ckutzne at gwdg.de
       
    Mon Sep 29 18:32:42 CEST 2014
    
    
  
Hi,
On 29 Sep 2014, at 18:17, Mark Abraham <mark.j.abraham at gmail.com> wrote:
> Hi,
> 
> It can't be fixed, because there is no surefire way to run an arbitrary tpr
> on an arbitrary number of ranks, regardless of how you guess -npme might
> succeed.
What about always making this check on two ranks, regardless of
what was specified on the g_tune_pme command line? On two ranks
we will never have separate PME ranks, so it should always work:
both ranks do PP and then PME.
If the system is so small that it cannot be decomposed into two
DD domains, there is no point in tuning anyway.
So even if you say
g_tune_pme -np 48 -s input.tpr
we first check with
mpirun -np 2 mdrun -s input.tpr
and only after that continue with -np 48.
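
To make the intended order concrete, here is a minimal sketch in Python. It only assembles the command lines and runs nothing. The helper name planned_mdrun_calls and the -npme candidates 8, 12 and 16 are made up for illustration; they are not taken from the g_tune_pme source.

```python
def planned_mdrun_calls(tpr, nranks, npme_candidates, mpirun="mpirun", mdrun="mdrun"):
    """Return the two-rank sanity check first, then the benchmark calls.

    Hypothetical helper for illustration; it only builds argument lists.
    """
    # The check always uses two ranks, so no -npme guess is involved.
    calls = [[mpirun, "-np", "2", mdrun, "-s", tpr]]
    # Only after the check would the benchmarks use the requested rank count.
    for npme in npme_candidates:
        calls.append([mpirun, "-np", str(nranks), mdrun,
                      "-npme", str(npme), "-s", tpr])
    return calls


# The -npme values here are arbitrary placeholders.
for call in planned_mdrun_calls("input.tpr", 48, [8, 12, 16]):
    print(" ".join(call))
```

The point of the ordering is that a failure of the first call says something about the installation or the input, not about an unlucky -npme guess.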
Carsten
 
> We should just make the check optional, instead of being a deal
> breaker.
> 
> Mark
> On Sep 29, 2014 4:35 PM, "Carsten Kutzner" <ckutzne at gwdg.de> wrote:
> 
>> Hi,
>> 
>> I see where the problem is.
>> There is an initial check in g_tune_pme to make sure that parallel
>> runs can be executed at all. This is being run with the automatic
>> number of PME-only ranks, which is 11 for your input file.
>> Unfortunately, this leaves 48 - 11 = 37 PP ranks, and 37 is prime,
>> so no domain decomposition can be found.
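
The arithmetic behind this can be checked with a few lines of Python. This is only a sketch: largest_prime_factor is a plain trial-division helper written for this mail, not code from GROMACS.

```python
def largest_prime_factor(n):
    """Largest prime factor of n (n >= 2), found by trial division."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    # Anything left above 1 is prime and larger than every factor found so far.
    return n if n > 1 else largest


total_ranks = 48
pme_ranks = 11                       # the -npme value from the offending command
pp_ranks = total_ranks - pme_ranks   # ranks left for the PP domains
print(pp_ranks, largest_prime_factor(pp_ranks))
```

With 37 PP ranks the largest prime factor is 37 itself, which is exactly what the fatal error in bench.log complains about.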
>> 
>> At some point in the past we discussed that this could happen
>> and it should be fixed. Will open a bug entry.
>> 
>> Thanks,
>>  Carsten
>> 
>> 
>> On 29 Sep 2014, at 15:36, Ebert Maximilian <m.ebert at umontreal.ca> wrote:
>> 
>>> Hi,
>>> 
>>> this is the command:
>>> 
>>> setenv MDRUN mdrun_mpi
>>> 
>>> g_tune_pme_mpi -np 48 -s ../eq_nvt/1ZG4_nvt.tpr -launch
>>> 
>>> 
>>> Here the output of perf.out
>>> 
>>> ------------------------------------------------------------
>>> 
>>>     P E R F O R M A N C E   R E S U L T S
>>> 
>>> ------------------------------------------------------------
>>> g_tune_pme_mpi for Gromacs VERSION 5.0.1
>>> Number of ranks         : 48
>>> The mpirun command is   : mpirun
>>> Passing # of ranks via  : -np
>>> The mdrun  command is   : mdrun_mpi
>>> mdrun args benchmarks   : -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log
>>> Benchmark steps         : 1000
>>> dlb equilibration steps : 100
>>> mdrun args at launchtime:
>>> Repeats for each test   : 2
>>> Input file              : ../eq_nvt/1ZG4_nvt.tpr
>>>  PME/PP load estimate : 0.151964
>>>  Number of particles  : 39489
>>>  Coulomb type         : PME
>>>  Grid spacing x y z   : 0.114561 0.114561 0.114561
>>>  Van der Waals type   : Cut-off
>>> 
>>> Will try these real/reciprocal workload settings:
>>> No.   scaling  rcoulomb  nkx  nky  nkz   spacing      rvdw  tpr file
>>>  0  1.000000  1.200000   72   72   72  0.120000   1.200000  ../eq_nvt/1ZG4_nvt_bench00.tpr
>>>  1  1.100000  1.320000   64   64   64  0.132000   1.320000  ../eq_nvt/1ZG4_nvt_bench01.tpr
>>>  2  1.200000  1.440000   60   60   60  0.144000   1.440000  ../eq_nvt/1ZG4_nvt_bench02.tpr
>>> 
>>> Note that in addition to the Coulomb radius and the Fourier grid
>>> other input settings were also changed (see table above).
>>> Please check if the modified settings are appropriate.
>>> 
>>> Individual timings for input file 0 (../eq_nvt/1ZG4_nvt_bench00.tpr):
>>> PME ranks      Gcycles       ns/day        PME/f    Remark
>>> 
>>> ------------------------------------------------------------
>>> Cannot run the benchmark simulations! Please check the error message of
>>> mdrun for the source of the problem. Did you provide a command line
>>> argument that neither g_tune_pme nor mdrun understands? Offending command:
>>> 
>>> mpirun -np 48 mdrun_mpi -npme 11 -s ../eq_nvt/1ZG4_nvt_bench00.tpr -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log  -nsteps 1 -quiet
>>> 
>>> 
>>> 
>>> and here are parts of the bench.log:
>>> 
>>> Log file opened on Mon Sep 29 08:56:38 2014
>>> Host: node-e1-67  pid: 24470  rank ID: 0  number of ranks:  48
>>> GROMACS:    gmx mdrun, VERSION 5.0.1
>>> 
>>> GROMACS is written by:
>>> Emile Apol         Rossen Apostolov   Herman J.C. Berendsen Par Bjelkmar
>>> Aldert van Buuren  Rudi van Drunen    Anton Feenstra     Sebastian Fritsch
>>> Gerrit Groenhof    Christoph Junghans Peter Kasson       Carsten Kutzner
>>> Per Larsson        Justin A. Lemkul   Magnus Lundborg    Pieter Meulenhoff
>>> Erik Marklund      Teemu Murtola      Szilard Pall       Sander Pronk
>>> Roland Schulz      Alexey Shvetsov    Michael Shirts     Alfons Sijbers
>>> Peter Tieleman     Christian Wennberg Maarten Wolf
>>> and the project leaders:
>>> Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>>> 
>>> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>>> Copyright (c) 2001-2014, The GROMACS development team at
>>> Uppsala University, Stockholm University and
>>> the Royal Institute of Technology, Sweden.
>>> check out http://www.gromacs.org for more information.
>>> 
>>> GROMACS is free software; you can redistribute it and/or modify it
>>> under the terms of the GNU Lesser General Public License
>>> as published by the Free Software Foundation; either version 2.1
>>> of the License, or (at your option) any later version.
>>> 
>>> GROMACS:      gmx mdrun, VERSION 5.0.1
>>> Executable:   /home/apps/Logiciels/gromacs/gromacs-5.0.1/bin/gmx_mpi
>>> Library dir:  /home/apps/Logiciels/gromacs/gromacs-5.0.1/share/gromacs/top
>>> Command line:
>>> mdrun_mpi -npme 11 -s ../eq_nvt/1ZG4_nvt_bench00.tpr -resetstep 100 -o bench.trr -x bench.xtc -cpo bench.cpt -c bench.gro -e bench.edr -g bench.log -nsteps 1 -quiet
>>> 
>>> Gromacs version:    VERSION 5.0.1
>>> Precision:          single
>>> Memory model:       64 bit
>>> MPI library:        MPI
>>> OpenMP support:     enabled
>>> GPU support:        disabled
>>> invsqrt routine:    gmx_software_invsqrt(x)
>>> SIMD instructions:  SSE4.1
>>> FFT library:        fftw-3.3.3-sse2
>>> RDTSCP usage:       enabled
>>> C++11 compilation:  enabled
>>> TNG support:        enabled
>>> Tracing support:    disabled
>>> Built on:           Tue Sep 23 09:58:07 EDT 2014
>>> Built by:           rqchpbib at briaree1 [CMAKE]
>>> Build OS/arch:      Linux 2.6.32-71.el6.x86_64 x86_64
>>> Build CPU vendor:   GenuineIntel
>>> Build CPU brand:    Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
>>> Build CPU family:   6   Model: 44   Stepping: 2
>>> Build CPU features: aes apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
>>> C compiler:         /RQusagers/apps/Logiciels/gcc/4.8.1/bin/gcc GNU 4.8.1
>>> C compiler flags:    -msse4.1   -Wno-maybe-uninitialized -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter   -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds  -O3 -DNDEBUG
>>> C++ compiler:       /RQusagers/apps/Logiciels/gcc/4.8.1/bin/g++ GNU 4.8.1
>>> C++ compiler flags:  -msse4.1   -std=c++0x -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds  -O3 -DNDEBUG
>>> Boost version:      1.55.0 (internal)
>>> 
>>> 
>>> ....
>>> 
>>> 
>>>     n = 0
>>>  E-zt:
>>>     n = 0
>>>  swapcoords                     = no
>>>  adress                         = FALSE
>>>  userint1                       = 0
>>>  userint2                       = 0
>>>  userint3                       = 0
>>>  userint4                       = 0
>>>  userreal1                      = 0
>>>  userreal2                      = 0
>>>  userreal3                      = 0
>>>  userreal4                      = 0
>>> grpopts:
>>>  nrdf:     10175.6     70836.4
>>>  ref-t:      304.65      304.65
>>>  tau-t:         0.5         0.5
>>> annealing:      Single      Single
>>> annealing-npoints:           4           4
>>> annealing-time [0]:              0.0       200.0       300.0       750.0
>>> annealing-temp [0]:             10.0       100.0       100.0       304.6
>>> annealing-time [1]:              0.0       200.0       300.0       750.0
>>> annealing-temp [1]:             10.0       100.0       100.0       304.6
>>>  acc:            0           0           0
>>>  nfreeze:           N           N           N
>>>  energygrp-flags[  0]: 0
>>> 
>>> Overriding nsteps with value passed on the command line: 1 steps, 0.002 ps
>>> 
>>> 
>>> Initializing Domain Decomposition on 48 ranks
>>> Dynamic load balancing: auto
>>> Will sort the charge groups at every domain (re)decomposition
>>> Initial maximum inter charge-group distances:
>>>   two-body bonded interactions: 0.422 nm, LJ-14, atoms 1444 1452
>>> multi-body bonded interactions: 0.422 nm, Proper Dih., atoms 1444 1452
>>> Minimum cell size due to bonded interactions: 0.464 nm
>>> Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
>>> Estimated maximum distance required for P-LINCS: 0.218 nm
>>> 
>>> -------------------------------------------------------
>>> Program mdrun_mpi, VERSION 5.0.1
>>> Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/mdlib/domdec_setup.c, line: 728
>>> 
>>> Fatal error:
>>> The number of ranks you selected (37) contains a large prime factor 37.
>>> In most cases this will lead to bad performance. Choose a number with
>>> smaller prime factors or set the decomposition (option -dd) manually.
>>> For more information and tips for troubleshooting, please check the GROMACS
>>> website at http://www.gromacs.org/Documentation/Errors
>>> -------------------------------------------------------
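
For picking a -npme value by hand, a rough screen along the lines of this error message can be sketched in Python. The cut-off of 7 for the largest allowed prime factor and the restriction to at most half of the ranks doing PME are my own assumptions for illustration, not the criteria mdrun applies. Passing this screen does not guarantee that a domain decomposition exists for a given system.

```python
def largest_prime_factor(n):
    """Largest prime factor of n (n >= 2), found by trial division."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    return n if n > 1 else largest


def screen_npme(total_ranks, max_prime=7):
    """Return -npme values whose PP rank count has no prime factor above max_prime.

    max_prime is an arbitrary illustrative threshold, not the one used by mdrun.
    """
    keep = []
    # Assumption: at most half of all ranks are used as PME-only ranks.
    for npme in range(1, total_ranks // 2 + 1):
        pp_ranks = total_ranks - npme
        if largest_prime_factor(pp_ranks) <= max_prime:
            keep.append(npme)
    return keep


print(screen_npme(48))
```

For 48 ranks the automatic guess of 11 does not pass this screen, whereas 12 (36 PP ranks) or 16 (32 PP ranks) do.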
>>> -----Original Message-----
>>> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf Of Carsten Kutzner
>>> Sent: Monday, 29 September 2014 15:23
>>> To: gmx-users at gromacs.org
>>> Subject: Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
>>> 
>>> Hi,
>>> 
>>> is this the only output?
>>> 
>>> Don't you get a perf.out file that lists which settings are optimal?
>>> 
>>> What exactly was the command line you used?
>>> 
>>> Carsten
>>> 
>>> 
>>> On 29 Sep 2014, at 15:01, Ebert Maximilian <m.ebert at umontreal.ca> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I just tried that and I got the following error message (bench.log).
>>>> Any idea what could be wrong?
>>>> 
>>>> Thank you very much,
>>>> 
>>>> Max
>>>> 
>>>> Initializing Domain Decomposition on 48 ranks
>>>> Dynamic load balancing: auto
>>>> Will sort the charge groups at every domain (re)decomposition
>>>> Initial maximum inter charge-group distances:
>>>>   two-body bonded interactions: 0.422 nm, LJ-14, atoms 1444 1452
>>>> multi-body bonded interactions: 0.422 nm, Proper Dih., atoms 1444 1452
>>>> Minimum cell size due to bonded interactions: 0.464 nm
>>>> Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
>>>> Estimated maximum distance required for P-LINCS: 0.218 nm
>>>> 
>>>> -------------------------------------------------------
>>>> Program mdrun_mpi, VERSION 5.0.1
>>>> Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/mdlib/domdec_setup.c, line: 728
>>>> 
>>>> Fatal error:
>>>> The number of ranks you selected (37) contains a large prime factor 37.
>>>> In most cases this will lead to bad performance. Choose a number with
>>>> smaller prime factors or set the decomposition (option -dd) manually.
>>>> For more information and tips for troubleshooting, please check the
>>>> GROMACS website at http://www.gromacs.org/Documentation/Errors
>>>> -------------------------------------------------------
>>>> 
>>>> -----Original Message-----
>>>> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf Of Carsten Kutzner
>>>> Sent: Thursday, 25 September 2014 19:29
>>>> To: gmx-users at gromacs.org
>>>> Subject: Re: [gmx-users] g_tune_pme_mpi is not compatible to mdrun_mpi
>>>> 
>>>> Hi,
>>>> 
>>>> don't invoke g_tune_pme with 'mpirun', because it is a serial
>>>> executable that itself invokes parallel MD runs for testing.
>>>> 
>>>> use
>>>> export MDRUN=mdrun_mpi
>>>> 
>>>> g_tune_pme -np 24 -s 1ZG4_nvt.tpr -launch
>>>> 
>>>> see also
>>>> 
>>>> g_tune_pme -h
>>>> 
>>>> You may need to recompile g_tune_pme without MPI enabled (depends on
>>>> your queueing system)
>>>> 
>>>> Best,
>>>> Carsten
>>>> 
>>>> 
>>>> On 25 Sep 2014, at 15:10, Ebert Maximilian <m.ebert at umontreal.ca> wrote:
>>>> 
>>>>> Dear list,
>>>>> 
>>>>> I tried using g_tune_pme_mpi with the command:
>>>>> 
>>>>> mpirun -np 24 g_tune_pme_mpi -np 24 -s 1ZG4_nvt.tpr -launch
>>>>> 
>>>>> on GROMACS 5.0.1 but I get the following error message:
>>>>> --------------------------------------------------------------------------
>>>>> mpirun was unable to launch the specified application as it
>>>>> could not find an executable:
>>>>> 
>>>>> Executable: mdrun
>>>>> Node: xxxx
>>>>> 
>>>>> while attempting to start process rank 0.
>>>>> --------------------------------------------------------------------------
>>>>> 24 total processes failed to start
>>>>> 
>>>>> 
>>>>> Any idea why this is? Shouldn't g_tune_pme_mpi call mdrun_mpi instead?
>>>>> 
>>>>> Thank you very much,
>>>>> 
>>>>> Max
>>>>> --
>>>>> Gromacs Users mailing list
>>>>> 
>>>>> * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>> posting!
>>>>> 
>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>> 
>>>>> * For (un)subscribe requests visit
>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-request at gromacs.org.
>>>> 
>>>> 
>>>> --
>>>> Dr. Carsten Kutzner
>>>> Max Planck Institute for Biophysical Chemistry Theoretical and
>>>> Computational Biophysics Am Fassberg 11, 37077 Goettingen, Germany
>>>> Tel. +49-551-2012313, Fax: +49-551-2012302
>>>> http://www.mpibpc.mpg.de/grubmueller/kutzner
>>>> http://www.mpibpc.mpg.de/grubmueller/sppexa
>>>> 
>>> 
>>> 
>> 
>> 
>> 
--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa
    
    