[gmx-users] Slow Runs

Justin A. Lemkul jalemkul at vt.edu
Fri Jan 28 21:44:59 CET 2011



Denny Frost wrote:
> I'm leaning toward the possibility that it is actually only running 8 
> copies of the same job on different processors.  My question is how does 
> gromacs4.5 know how many processors it has available to parallelize a 
> job?  Is it specified in grompp or does it just detect it? 
> 

If you're using MPI, it comes from the MPI launcher (mpiexec, mpirun, or whatever
your system provides).  Setting the proper flag there, e.g. -np as in the mpirun
line quoted below, is what tells mdrun how many nodes to use.
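
As a rough sketch of both points (the function names are made up for
illustration, and the awk pattern assumes the cycle accounting table in your
log is laid out like the one you posted):

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of GROMACS.

# Build the launch line. The process count is an argument of the MPI
# launcher (mpirun here), not of mdrun_mpi.
launch_line() {
    nprocs=$1
    deffnm=$2
    printf 'mpirun -np %s mdrun_mpi -deffnm %s\n' "$nprocs" "$deffnm"
}

# Print the "Nodes" value from the "Force" row of the cycle accounting
# table in an mdrun log. Assumes a six-column row of the form:
#   Force   1   991941   188562.885   70886.8   81.7
nodes_used() {
    awk '$1 == "Force" && NF == 6 { print $2; exit }' "$1"
}
```

With these, "launch_line 8 md" prints the kind of line we use on our cluster,
and "nodes_used md.log" should print something larger than 1 if the run was
really parallel.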

-Justin

> On Fri, Jan 28, 2011 at 1:32 PM, Justin A. Lemkul <jalemkul at vt.edu> wrote:
> 
> 
> 
>     Denny Frost wrote:
> 
>         Here's my grompp command:
> 
>         grompp_d -nice 0 -v -f md.mdp -c ReadyForMD.gro -o md.tpr -p top.top
> 
>         and my mdrun command is this:
>         time mpiexec mdrun_mpi -np 8 -cpt 30000 -nice 0 -nt 1 -s
>         $PBS_O_WORKDIR/md.tpr -o $PBS_O_WORKDIR/mdDone.trr -x
>         $PBS_O_WORKDIR/mdDone.xtc -c $PBS_O_WORKDIR/mdDone.gro -e
>         $PBS_O_WORKDIR/md.edr -g $PBS_O_WORKDIR/md.log 1>
>         $PBS_JOBID.pgm.out 2> $PBS_JOBID.pgm.err
> 
> 
>     The -np option of mdrun is nonexistent, but mdrun does not check for
>     proper command line arguments, so you won't get an error.  But then
>     you've said that 8 processors are active, so I still suspect that
>     mdrun was compiled incorrectly or in such a way that it's
>     incompatible with your system.  The output from the .log file
>     indicates that only one processor was used.  Maybe your admins can
>     help you on this one, if the jobs spit out any useful diagnostic
>     information.
> 
>     For our cluster, we use e.g.:
> 
>     mpirun -np 8 mdrun_mpi -deffnm md
> 
>     -Justin
> 
>         I set -cpt to 30000 because I don't want a checkpoint file:
>         every time mdrun tries to write one, it fails due to quota
>         issues and kills the job.  I'm not sure why this happens, but I
>         think it's a separate issue to take up with my supercomputing
>         facility.
> 
>         On Fri, Jan 28, 2011 at 1:18 PM, Justin A. Lemkul
>         <jalemkul at vt.edu> wrote:
> 
> 
> 
>            Denny Frost wrote:
> 
>                all 8 nodes are running at full capacity, though
> 
> 
>            What is your mdrun command line?  How did you compile it?
>            What can happen is something went wrong during installation,
>            so you think you have an MPI-enabled binary, but it is simply
>            executing 8 copies of the same job.
> 
>            -Justin
> 
>                On Fri, Jan 28, 2011 at 1:13 PM, Justin A. Lemkul
>                <jalemkul at vt.edu> wrote:
> 
> 
> 
>                   Denny Frost wrote:
> 
>                       Here's what I've got:
> 
>                       M E G A - F L O P S   A C C O U N T I N G
> 
>                       RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>                       T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>                       NF=No Forces
> 
>                       Computing:                    M-Number         M-Flops % Flops
>                       --------------------------------------------------------------
>                       Coul(T) + VdW(T)        1219164.751609    82903203.109    80.6
>                       Outer nonbonded loop      25980.879385      259808.794     0.3
>                       Calc Weights              37138.271040     1336977.757     1.3
>                       Spread Q Bspline         792283.115520     1584566.231     1.5
>                       Gather F Bspline         792283.115520     4753698.693     4.6
>                       3D-FFT                   119163.856212      953310.850     0.9
>                       Solve PME                  2527.465668      161757.803     0.2
>                       NS-Pairs                  47774.705001     1003268.805     1.0
>                       Reset In Box                371.386080        1114.158     0.0
>                       Shift-X                   24758.847360      148553.084     0.1
>                       CG-CoM                     1237.953600        3713.861     0.0
>                       Angles                    18569.135520     3119614.767     3.0
>                       Propers                   14855.308416     3401865.627     3.3
>                       Impropers                  3094.855920      643730.031     0.6
>                       Virial                     1242.417375       22363.513     0.0
>                       Stop-CM                    1237.953600       12379.536     0.0
>                       P-Coupling                12379.423680       74276.542     0.1
>                       Calc-Ekin                 12379.436160      334244.776     0.3
>                       Lincs                     11760.476208      705628.572     0.7
>                       Lincs-Mat                245113.083072      980452.332     1.0
>                       Constraint-V              23520.928704      188167.430     0.2
>                       Constraint-Vir            11760.452496      282250.860     0.3
>                       --------------------------------------------------------------
>                       Total                                    102874947.133   100.0
>                       --------------------------------------------------------------
> 
> 
>                       R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
> 
>                       Computing:           Nodes    Number     G-Cycles    Seconds       %
>                       --------------------------------------------------------------------
>                       Neighbor search          1     99195     8779.027     3300.3     3.8
>                       Force                    1    991941   188562.885    70886.8    81.7
>                       PME mesh                 1    991941    18012.830     6771.6     7.8
>                       Write traj.              1        41       16.835        6.3     0.0
>                       Update                   1    991941     2272.379      854.3     1.0
>                       Constraints              1    991941    11121.146     4180.8     4.8
>                       Rest                     1               2162.628      813.0     0.9
>                       --------------------------------------------------------------------
>                       Total                    1             230927.730    86813.1   100.0
>                       --------------------------------------------------------------------
>                       PME spread/gather        1   1983882    17065.384     6415.4     7.4
>                       PME 3D-FFT               1   1983882      503.340      189.2     0.2
>                       PME solve                1    991941      427.136      160.6     0.2
>                       --------------------------------------------------------------------
> 
>                       Does that mean it's only using 1 node?  That would
>                       explain the speed issues.
> 
> 
>                   That's what it looks like to me.
> 
> 
>                   -Justin
> 
>                   --
>                   ========================================
> 
>                   Justin A. Lemkul
>                   Ph.D. Candidate
>                   ICTAS Doctoral Scholar
>                   MILES-IGERT Trainee
>                   Department of Biochemistry
>                   Virginia Tech
>                   Blacksburg, VA
>                   jalemkul[at]vt.edu | (540) 231-9080
>                   http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
> 
>                   ========================================
>                   --
>                   gmx-users mailing list    gmx-users at gromacs.org
>                   http://lists.gromacs.org/mailman/listinfo/gmx-users
>                   Please search the archive at
>                   http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
>                   Please don't post (un)subscribe requests to the list. Use the www
>                   interface or send it to gmx-users-request at gromacs.org.
>                   Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> 
> 
> 
> 
> 
> 
> 
> 

-- 
========================================

Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================


