[gmx-users] Slow Runs

Denny Frost dsfrost at cableone.net
Fri Jan 28 21:43:47 CET 2011


I'm leaning toward the possibility that it is actually just running 8 copies
of the same job on different processors.  My question is: how does GROMACS 4.5
know how many processors it has available to parallelize a job?  Is that
specified in grompp, or does it just detect it at run time?

On Fri, Jan 28, 2011 at 1:32 PM, Justin A. Lemkul <jalemkul at vt.edu> wrote:

>
>
> Denny Frost wrote:
>
>> Here's my grompp command:
>>
>> grompp_d -nice 0 -v -f md.mdp -c ReadyForMD.gro -o md.tpr -p top.top
>>
>> and my mdrun command is this:
>> time mpiexec mdrun_mpi -np 8 -cpt 30000 -nice 0 -nt 1 -s
>> $PBS_O_WORKDIR/md.tpr -o $PBS_O_WORKDIR/mdDone.trr -x
>> $PBS_O_WORKDIR/mdDone.xtc -c $PBS_O_WORKDIR/mdDone.gro -e
>> $PBS_O_WORKDIR/md.edr -g $PBS_O_WORKDIR/md.log 1> $PBS_JOBID.pgm.out 4>
>> $PBS_JOBID.pgm.err
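>>
>> (The same command, rewritten only for readability with line
>> continuations; arguments and paths are unchanged:)
>>
>> time mpiexec mdrun_mpi -np 8 -cpt 30000 -nice 0 -nt 1 \
>>     -s $PBS_O_WORKDIR/md.tpr -o $PBS_O_WORKDIR/mdDone.trr \
>>     -x $PBS_O_WORKDIR/mdDone.xtc -c $PBS_O_WORKDIR/mdDone.gro \
>>     -e $PBS_O_WORKDIR/md.edr -g $PBS_O_WORKDIR/md.log \
>>     1> $PBS_JOBID.pgm.out 4> $PBS_JOBID.pgm.err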
>>
>>
> The -np option of mdrun is nonexistent, but mdrun does not check for
> unrecognized command line arguments, so you won't get an error.  Still,
> you've said that 8 processors are active, so I suspect that mdrun was
> compiled incorrectly, or in such a way that it's incompatible with your
> system.  The output from the .log file indicates that only one processor
> was used.  Maybe your admins can help you with this one, if the jobs spit
> out any useful diagnostic information.
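>
> One quick thing to check (just a sketch, assuming md.log is the log from
> the command above): a run that is really split over several MPI processes
> should mention domain decomposition in the log, e.g.
>
> grep -i "domain decomposition" md.log
>
> If that turns up nothing and the cycle accounting keeps reporting 1 node,
> the run is almost certainly serial.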
>
> For our cluster, we use e.g.:
>
> mpirun -np 8 mdrun_mpi -deffnm md
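>
> In 4.x the number of MPI processes comes entirely from the MPI launcher
> (the -np you give to mpirun/mpiexec); grompp no longer takes -np, and
> mdrun picks the count up at run time.  A minimal PBS fragment along those
> lines, reusing $PBS_O_WORKDIR from your script, might look like:
>
> cd $PBS_O_WORKDIR
> mpirun -np 8 mdrun_mpi -deffnm md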
>
> -Justin
>
>> I set the -cpt option to 30000 because I don't want a checkpoint file:
>> every time mdrun tries to write one, it fails due to quota issues and
>> kills the job.  I'm not sure why this happens, but I think it's a
>> separate issue to take up with my supercomputing facility.
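>>
>> (As I understand the mdrun help, -cpt is the checkpoint interval in
>> minutes, so 30000 min is roughly 500 hours, about 21 days, which should
>> keep a checkpoint from ever being written during a normal run.)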
>>
>> On Fri, Jan 28, 2011 at 1:18 PM, Justin A. Lemkul <jalemkul at vt.edu> wrote:
>>
>>
>>
>>    Denny Frost wrote:
>>
>>        all 8 nodes are running at full capacity, though
>>
>>
>>    What is your mdrun command line?  How did you compile it?  What can
>>    happen is that something went wrong during installation, so you think
>>    you have an MPI-enabled binary, but it is simply executing 8 copies of
>>    the same job.
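>>
>>    One rough way to check (a sketch, assuming mdrun_mpi is on your PATH
>>    on the compute nodes) is to see whether the binary is dynamically
>>    linked against an MPI library:
>>
>>    ldd $(which mdrun_mpi) | grep -i mpi
>>
>>    If nothing MPI-related shows up (and the build isn't static), mpiexec
>>    will just start 8 independent serial copies of it.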
>>
>>    -Justin
>>
>>        On Fri, Jan 28, 2011 at 1:13 PM, Justin A. Lemkul
>>        <jalemkul at vt.edu> wrote:
>>
>>
>>
>>           Denny Frost wrote:
>>
>>               Here's what I've got:
>>
>>               M E G A - F L O P S   A C C O U N T I N G
>>
>>                 RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>>                 T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>>                 NF=No Forces
>>
>>                Computing:                    M-Number         M-Flops  % Flops
>>  -----------------------------------------------------------------------------
>>                Coul(T) + VdW(T)        1219164.751609    82903203.109     80.6
>>                Outer nonbonded loop      25980.879385      259808.794      0.3
>>                Calc Weights              37138.271040     1336977.757      1.3
>>                Spread Q Bspline         792283.115520     1584566.231      1.5
>>                Gather F Bspline         792283.115520     4753698.693      4.6
>>                3D-FFT                   119163.856212      953310.850      0.9
>>                Solve PME                  2527.465668      161757.803      0.2
>>                NS-Pairs                  47774.705001     1003268.805      1.0
>>                Reset In Box                371.386080        1114.158      0.0
>>                Shift-X                   24758.847360      148553.084      0.1
>>                CG-CoM                     1237.953600        3713.861      0.0
>>                Angles                    18569.135520     3119614.767      3.0
>>                Propers                   14855.308416     3401865.627      3.3
>>                Impropers                  3094.855920      643730.031      0.6
>>                Virial                     1242.417375       22363.513      0.0
>>                Stop-CM                    1237.953600       12379.536      0.0
>>                P-Coupling                12379.423680       74276.542      0.1
>>                Calc-Ekin                 12379.436160      334244.776      0.3
>>                Lincs                     11760.476208      705628.572      0.7
>>                Lincs-Mat                245113.083072      980452.332      1.0
>>                Constraint-V              23520.928704      188167.430      0.2
>>                Constraint-Vir            11760.452496      282250.860      0.3
>>  -----------------------------------------------------------------------------
>>                Total                                    102874947.133    100.0
>>  -----------------------------------------------------------------------------
>>
>>
>>                   R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>>                Computing:          Nodes     Number     G-Cycles    Seconds      %
>>  -----------------------------------------------------------------------
>>                Neighbor search         1      99195     8779.027     3300.3    3.8
>>                Force                   1     991941   188562.885    70886.8   81.7
>>                PME mesh                1     991941    18012.830     6771.6    7.8
>>                Write traj.             1         41       16.835        6.3    0.0
>>                Update                  1     991941     2272.379      854.3    1.0
>>                Constraints             1     991941    11121.146     4180.8    4.8
>>                Rest                    1                 2162.628      813.0    0.9
>>  -----------------------------------------------------------------------
>>                Total                   1               230927.730    86813.1  100.0
>>  -----------------------------------------------------------------------
>>  -----------------------------------------------------------------------
>>                PME spread/gather       1    1983882    17065.384     6415.4    7.4
>>                PME 3D-FFT              1    1983882      503.340      189.2    0.2
>>                PME solve               1     991941      427.136      160.6    0.2
>>  -----------------------------------------------------------------------
>>
>>               Does that mean it's only using 1 node?  That would explain
>>               the speed issues.
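>>
>>               (For what it's worth, the Seconds column totals about
>>               86813 s, roughly 24 hours of wall time for ~992,000 MD
>>               steps, and every row is attributed to a single node.)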
>>
>>
>>           That's what it looks like to me.
>>
>>
>>           -Justin
>>
> --
> ========================================
>
> Justin A. Lemkul
> Ph.D. Candidate
> ICTAS Doctoral Scholar
> MILES-IGERT Trainee
> Department of Biochemistry
> Virginia Tech
> Blacksburg, VA
> jalemkul[at]vt.edu | (540) 231-9080
> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>
> ========================================
> --
> gmx-users mailing list    gmx-users at gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> Please don't post (un)subscribe requests to the list. Use the www interface
> or send it to gmx-users-request at gromacs.org.
> Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>

