[gmx-users] Efficiently running multiple simulations

Zimmerman, Maxwell mizimmer at wustl.edu
Thu Sep 17 21:41:14 CEST 2015


Hi Mark,

I am not sure that there is a problem with the MPI system or job scheduler. When I run 8 simulations with 2 threads each and log onto the node, I can see that every core is being used equally.

Regards,
-Maxwell



________________________________________
From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Mark Abraham <mark.j.abraham at gmail.com>
Sent: Thursday, September 17, 2015 12:18 PM
To: gmx-users at gromacs.org; gromacs.org_gmx-users at maillist.sys.kth.se
Subject: Re: [gmx-users] Efficiently running multiple simulations

Hi,

On Thu, Sep 17, 2015 at 6:24 PM Zimmerman, Maxwell <mizimmer at wustl.edu>
wrote:

> Hi Mark,
>
> Thank you for reviewing the log files! I will recompile GROMACS for
> AVX2_256 SIMD to efficiently use the CPUs.
>
> The number of cores on the node is 16 and each core has two threads. By
> default, GROMACS picked 4 threads per simulation. I reran the simulation by
> specifying 2 threads per simulation and am providing a link to the new log
> file. I still see the same results, which is a lower performance per
> simulation when using 8 simulations on the node as opposed to just 1.
>

OK. That's quite good evidence that your MPI system or job scheduler is
probably deciding how to lay out the 8 ranks in a way that isn't working
well. (The PME tuning is also still trying very hard to shift workload off
the CPU and onto the GPUs.)

For example, your MPI setup might put the 8 ranks on the first 8 cores and
bind each rank's threads to those cores, so the other 8 cores sit unused
*and* the "second" OpenMP thread adds little value because it has to run as
the second thread on the same core. mpirun typically takes various options
to control how this works, so you should find out what default is in effect
for you (docs, sysadmins) and try some alternatives.
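
For example, with Open MPI (just a sketch; your cluster's MPI and scheduler
may differ, so check the local docs), you could ask for two cores per rank
and have the chosen binding printed out:

mpirun -np 8 --map-by socket:PE=2 --bind-to core --report-bindings mdrun_mpi -multi 8 -ntomp 2

Here --map-by socket:PE=2 spreads the ranks over the sockets with two cores
each, --bind-to core binds each rank's threads to those cores, and
--report-bindings shows you the resulting layout. Intel MPI and SLURM have
their own equivalents (e.g. the I_MPI_PIN_DOMAIN environment variable, or
srun's --cpu_bind option). If you let the MPI library do the binding, check
that it doesn't conflict with mdrun's own -pin on.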

Since you have 16 real cores, you probably have 2 processor dies, each with
8 cores. You probably want 4 simulations per die, each with its threads on
adjacent cores. First, try each simulation with two OpenMP threads, so that
a single thread runs on each core. This should be a dramatic performance
improvement. Then try four OpenMP threads per simulation on those two cores
(i.e. using the hyperthreads), which will probably improve things a bit
further.
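
If you are not sure how the cores are laid out over the dies, and lscpu or
hwloc's lstopo happen to be installed on the node, they will show the
sockets, cores and hyperthreads directly:

lscpu | grep -E 'Thread|Core|Socket'
lstopo --no-io

(lstopo comes with the hwloc package and may not be installed everywhere.)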

> https://www.dropbox.com/s/rd3akpqcn3mo7c5/md_8GPUs_2Cores.log?dl=0
>
> My problem isn't that I am confusing "node" with "simulation", but rather
> that each of the 8 log files (each log file corresponds to a single
> simulation of the 8 that were run using "-multi") shows mapping to 8 GPUs.
> Is each log file telling me that the simulations are all aware of each
> other,


Yes. Each simulation is running on the same node, so each is reporting the
same properties about that node. mdrun -gpu_id maps PP ranks present on the
node to GPUs present on the node; in this case each PP rank comes from a
different simulation. The code that handles gmx mdrun_mpi -multi 8 knows
it's running 8 simulations, and observes they're all on the same node, so
it does the obvious thing.

> or is each simulation trying to create 8 domains to split up on the GPUs?
>

No. Each log file clearly reports that its simulation has 1 MPI rank and
thus 1 domain. I've already said there's no way to split that 1 domain up.

Once you've got things working properly, you can experiment with mpirun -np
16 mdrun_mpi -multi 8 -gpu_id 0011223344556677 -ntomp 2 (or -ntomp 1) and
observe the difference in the log files.
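
Written out in full, those experiments would look something like the
following (a sketch only; adjust the pinning flags to whatever turns out to
work with your MPI setup):

mpirun -np 16 mdrun_mpi -multi 8 -gpu_id 0011223344556677 -ntomp 2 -pin on
mpirun -np 16 mdrun_mpi -multi 8 -gpu_id 0011223344556677 -ntomp 1 -pin on

i.e. 16 PP ranks, 2 per simulation, with each simulation's pair of ranks
sharing one GPU, and either two OpenMP threads per rank (using the
hyperthreads) or one.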

> I am still puzzled as to the loss in performance with increasing number of
> simulations on the node. I tried running 2 simulations with 2 GPUs and 4
> CPUs, 4 simulations with 4 GPUs and 8 CPUs, and 6 simulations with 6 GPUs
> and 12 CPUs. For the increasing number of simulations on the node, I see a
> greater loss in performance for individual simulations.
>

That's consistent with increasing contention for whatever subset of the CPU
cores is being used in practice. If you can log into the node
interactively, there are various performance monitoring tools that will
show you the occupancies of the CPU cores (and GPUs). Talk with your
sysadmins.
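
For example, assuming the usual tools are available on the node, top (press
'1' to get the per-core view) or htop will show the load on each core, and
nvidia-smi will show the utilization of each GPU:

top
nvidia-smi -l 5

If the threads of the 8 simulations are all piled onto the same few cores,
it will be obvious from that view.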

Mark


> Regards,
> -Maxwell
>
> _______________________________________
> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <
> gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Mark
> Abraham <mark.j.abraham at gmail.com>
> Sent: Wednesday, September 16, 2015 4:44 PM
> To: gmx-users at gromacs.org; gromacs.org_gmx-users at maillist.sys.kth.se
> Subject: Re: [gmx-users] Efficiently running multiple simulations
>
> Hi,
>
> The log files tell you that you should compile for AVX2_256 SIMD for the
> Haswell CPUs you have. Do that. Your runs are wasting a fair chunk of the
> value of the CPU hardware, and your setup absolutely needs to extract every
> last drop from the CPUs. That means you need to follow the instructions in
> the GROMACS install guide, which suggest you use a recent compiler. Your
> GROMACS was compiled with gcc 4.4.7, which was about two years old before a
> Haswell was sold! Why HPC clusters buy the latest hardware and continue to
> default to the "stable" 5-year-old compiler suite shipped with the
> "enterprise" distribution remains a total mystery to me. :-)
>
> The log file also says that your MPI system is starting four OpenMP threads
> per rank in the multi-simulation case, so the comparison is not valid.
> Starting 8*4 OpenMP threads on your node oversubscribes the actual cores,
> and this is terrible for GROMACS. You need to find out how many actual
> cores you have (each of which can have two hyperthreads, which is usually
> worth using on such Haswell machines). You want either one thread per core,
> or two threads per core (try both). If you don't know how many actual cores
> there are, consult your local docs/admins.
>
> "Mapping of GPUs to the 8 PP ranks in this node: #0, #1, #2, #3, #4, #5,
> #6, #7" is actually correct and unambiguous. There's 8 simulations, each
> with 1 domain, so 8 PP ranks, and each is mapped to one of 8 GPUs *in this
> node*. You've been reading "node" and thinking "simulation."
>
> Mark
>
>
> On Wed, Sep 16, 2015 at 9:23 PM Zimmerman, Maxwell <mizimmer at wustl.edu>
> wrote:
>
> > Hi Mark,
> >
> > Here are two links to .log files for running 1 simulation on 1 GPU and 2
> > CPUs and 8 simulations across all 8 GPUs and 16 CPUs respectively:
> >
> > https://www.dropbox.com/s/ko2l0qlr4kdpt51/md_1GPU.log?dl=0
> > https://www.dropbox.com/s/chtcv4nqxof64p8/md_8GPUs.log?dl=0
> >
> > Regards,
> > -Maxwell
> >
> >
> > ________________________________________
> > From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <
> > gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Mark
> > Abraham <mark.j.abraham at gmail.com>
> > Sent: Wednesday, September 16, 2015 1:39 PM
> > To: gmx-users at gromacs.org; gromacs.org_gmx-users at maillist.sys.kth.se
> > Subject: Re: [gmx-users] Efficiently running multiple simulations
> >
> > Hi,
> >
> > On Wed, Sep 16, 2015 at 5:46 PM Zimmerman, Maxwell <mizimmer at wustl.edu>
> > wrote:
> >
> > > Hi Mark,
> > >
> > > Thank you for the feedback.
> > >
> > > To ensure that I am making a proper comparison, I tried running:
> > > mpirun -np 1 mdrun_mpi -ntomp 2 -gpu_id 0 -pin on
> > > and I still see the same pattern; running a single simulation with 1 GPU
> > > and 2 CPUs performs nearly twice as well as running 8 simulations with
> > > "-multi" using 8 GPUs and 16 CPUs.
> > >
> >
> > OK. In that case, please share some links to .log files on a file-sharing
> > service, so we might be able to see where the issue arises. The list does
> > not accept attachments.
> >
> > > Just to clarify, when I use "-multi" all 8 of the .log files show that 8
> > > GPUs are selected for the run. If a single GPU were being used, wouldn't
> > > it only show mapping to one GPU ID per .log file?
> > >
> >
> > I forget the details here, but organizing the mapping has to be done on a
> > per-node basis. It would not surprise me if the reporting was not strictly
> > valid on a per-simulation basis, but it ought to mention that the 8 GPUs
> > are assets of the node, and not necessarily of the simulation.
> >
> > There is absolutely no way that any simulation with a single domain can
> > share 8 GPUs.
> >
> > Mark
> >
> >
> > > Regards,
> > > -Maxwell
> > >
> > > ________________________________________
> > > From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <
> > > gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Mark
> > > Abraham <mark.j.abraham at gmail.com>
> > > Sent: Wednesday, September 16, 2015 10:08 AM
> > > To: gmx-users at gromacs.org; gromacs.org_gmx-users at maillist.sys.kth.se
> > > Subject: Re: [gmx-users] Efficiently running multiple simulations
> > >
> > > Hi,
> > >
> > >
> > > On Wed, Sep 16, 2015 at 4:41 PM Zimmerman, Maxwell <mizimmer at wustl.edu>
> > > wrote:
> > >
> > > > Hi Mark,
> > > >
> > > > Sorry for the confusion, what I meant to say was that each node on the
> > > > cluster has 8 GPUs and 16 CPUs.
> > > >
> > >
> > > OK. Please note that "CPU" is ambiguous, so you should prefer not to use
> > > it without clarification.
> > >
> > > Unless the GPUs are weak and the CPU is strong, 2 CPU cores per GPU will
> > > likely be under-powered for PME simulations in GROMACS.
> > >
> > > > When I attempt to specify the GPU IDs for running 8 simulations on a node
> > > > using the "-multi" and "-gpu_id", each .log file has the following:
> > > >
> > > > "8 GPUs user-selected for this run.
> > > > Mapping of GPUs to the 8 PP ranks in this node: #0, #1, #2, #3, #4, #5,
> > > > #6, #7"
> > > >
> > > > This makes me think that each simulation is competing for each of the
> > > > GPUs
> > >
> > >
> > > You are running 8 simulations, each of which has a single domain, each of
> > > which is mapped to a single PP rank, each of which is mapped to a different
> > > single GPU. Perfect.
> > >
> > > > explaining my performance loss per simulation compared to running 1
> > > > simulation on 1 GPU and 2 CPUs.
> > >
> > >
> > > Very likely you are not comparing with what you think you are, e.g. you
> > > need to compare with an otherwise empty node running something like
> > >
> > > mpirun -np 1 mdrun_mpi -ntomp 2 -gpu_id 0 -pin on
> > >
> > > so that you actually have a single process running on two pinned CPU cores
> > > and a single GPU. This should be fairly comparable with the mdrun -multi
> > > setup.
> > >
> > > A side-by-side diff of that log file and the log file of the 0th member of
> > > the multi-sim should show very few differences until the simulation starts,
> > > and comparable performance. If not, please share your .log files on a
> > > file-sharing service.
> > >
> > > > If this interpretation is correct, is there a better way to pin each
> > > > simulation to a single GPU and 2 CPUs? If my interpretation is incorrect,
> > > > is there a more efficient way to use the "-multi" option to match the
> > > > performance I see of running a single simulation * 8?
> > > >
> > >
> > > mdrun will handle all of that correctly if it hasn't been crippled by how
> > > the MPI library has organized life. You want it to assign ranks to cores
> > > that are close to each other and their matching GPU. That tends to be the
> > > default behaviour, but clusters intended for node sharing can do weird
> > > things. (It is not yet clear that any of this is a problem.)
> > >
> > > Mark
> > >
> > >
> > > > Regards,
> > > > -Maxwell
> > > >
> > > >
> > > > ________________________________________
> > > > From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <
> > > > gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Mark
> > > > Abraham <mark.j.abraham at gmail.com>
> > > > Sent: Wednesday, September 16, 2015 3:52 AM
> > > > To: gmx-users at gromacs.org; gromacs.org_gmx-users at maillist.sys.kth.se
> > > > Subject: Re: [gmx-users] Efficiently running multiple simulations
> > > >
> > > > Hi,
> > > >
> > > > I'm confused by your description of the cluster as having 8 GPUs and 16
> > > > CPUs. The relevant parameters are the number of GPUs and CPU cores per
> > > > node. See the examples at
> > > >
> > > > http://manual.gromacs.org/documentation/5.1/user-guide/mdrun-features.html#running-multi-simulations
> > > >
> > > > Mark
> > > >
> > > > On Tue, Sep 15, 2015 at 11:38 PM Zimmerman, Maxwell <mizimmer at wustl.edu>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > >
> > > > > I am having some troubles efficiently running simulations in parallel on a
> > > > > gpu-cluster. The cluster has 8 GPUs and 16 CPUs. Currently, the command
> > > > > that I am using is:
> > > > >
> > > > >
> > > > > mpirun -np 8 mdrun_mpi -multi 8 -nice 4 -s md -o md -c after_md -v -x
> > > > > frame -pin on
> > > > >
> > > > >
> > > > > Per-simulation, the performance I am getting with this command is
> > > > > significantly lower than running 1 simulation that uses 1 GPU and 2 CPUs
> > > > > alone. This command seems to use all 8 GPUs and 16 CPUs on the 8 parallel
> > > > > simulations, although I think this would be faster if I could pin each
> > > > > simulation to a specific GPU and pair of CPUs. The -gpu_id option does not
> > > > > seem to change anything when I am using the mpirun. Is there a way that I
> > > > > can efficiently run the 8 simulations on the cluster by specifying the GPU
> > > > > and CPUs to run with each simulation?
> > > > >
> > > > >
> > > > > Thank you in advance!
> > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > -Maxwell
--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.

