[gmx-users] Looking for the source of the H-REMD slowdown when decoupling a lot of atoms

Sat Apr 2 13:29:07 CEST 2016

Dear Nicolas and Mark:

Indeed, I have spent the last 2 days testing the plumed Hrex code for gromacs 5.1.2. This is likely the way that I will go. I was looking for an alternate route because I am cautious of bugs and I thought that if I could do it easily in tested gromacs + a bit of code that I modified myself then I could trust it more than plugin whose size is epic enough that I will never read all of the new code. Whether plumed knows about something like cmap (etc., not to get caught up in details here) takes time to test.

Thank you both for your suggestions, I will move forward with plumed. 

Chris.

________________________________________
From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Nicolas Cheron <nicolas.cheron.boulot at gmail.com>
Sent: 02 April 2016 06:18
To: gmx-users at gromacs.org
Subject: Re: [gmx-users] Looking for the source of the H-REMD slowdown when decoupling a lot of atoms

See: https://groups.google.com/forum/#!topic/plumed-users/eJ0xpnHPb_s and
https://github.com/GiovanniBussi/plumed2/tree/v2.2-hrex

Nicolas

2016-04-02 12:09 GMT+02:00 Mark Abraham <mark.j.abraham at gmail.com>:

> Hi,
>
> On Fri, Apr 1, 2016 at 11:39 PM Christopher Neale <
> chris.neale at alum.utoronto.ca> wrote:
>
> > Dear developers:
> >
> > Is it possible with only a small amount of work to modify calc_delta() in
> > src/programs/mdrun/repl_ex.cpp to add another case to re->type such that
> > the program sends the coordinates to the alternate .tpr information and
> > evaluates the energy completely?
>
>
> This is approximately what the implementation of plumed does.
>
> This would allow for arbitrary exchanges, including bizarre things like
> > exchanging the cutoff distance or whatever (not what I want to do -- I'm
> > just emphasizing the generality of this approach). It will certainly cost
> > some communication and compute cycles to do this, but for e.g. I
> currently
> > have a 12x slowdown using the free energy architecture while decoupling
> 5K
> > atoms. So if I do an exchange attempt every 500 steps, we're still
> breaking
> > even if this complete energy re-evaluation takes 5,500x longer than a
> > regular integration step -- and I don't see how it could possibly take
> > anywhere near that long assuming bMultiEx==FALSE.
> >
>
> One can also implement such schemes as a script that calls grompp and mdrun
> -rerun before relaunching whatever simulation. Horrible, but probably
> better.
>
> Possible problems that I can imagine are:
> > (a) the functions for a complete re-evaluation don't exist (either the
> > communication of coordinates or the energy evaluation)
> >
>
> The code's not modular enough for that, yet.
>
> (b) some issues with changes in temperature or pressure
> >
>
> Yes, such state is not well organised.
>
> (c) the coordinates that are required to do this evaluation no longer exist
> > because the energy evaluation functions are coupled to the timestep so we
> > really need to pass them the coordinates of the previous timestep.
> >
>
> No, those are separate stages, because there are multiple parts of the code
> computing forces.
>
> Any suggestions are really appreciated
> >
>
> Updating plumed for 5.1 is easily the path of least resistance, if it can
> do the job.
>
> Mark
>
> Thank you,
> > Chris.
> >
> > ________________________________________
> > From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <
> > gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Michael
> > Shirts <mrshirts at gmail.com>
> > Sent: 31 March 2016 14:00
> > To: Discussion list for GROMACS users
> > Subject: Re: [gmx-users] Looking for the source of the H-REMD slowdown
> > when decoupling a lot of atoms
> >
> > > One more question: why does the free energy code have to use its own
> > kernel? I realize that I'm going to sound like an idiot here, but why
> can't
> > one just tweak parameters outside of the kernel and then have the
> optimized
> > kernel do the dynamics? I presume that one has to step outside of the
> > kernel to do the replica exchanges, so why can't the code just use the
> > optimized kernel to do the dynamics between exchanges?
> >
> > The optimized energy code only codes very specific interactions;
> > vanilla Coulomb, and Lennard-Jones.  If it coded the more general
> > interactions (such as soft-core) it would be significantly slower.
> >
> > We've come up with some ways to handle free energy calculations with
> > optimized inner loops (I think we're talking about the same thing Mark
> > -- though we should coordinate to be sure), but that will take a
> > little time.
> >
> > One alternate possibility would be to create N replicas of the system,
> > with slightly different parameters, but where all systems use the
> > standard functional form, and then combine the results internall.  For
> > something like REST2, then this could work, since one is only linearly
> > scaling the interactions between two parts A and B of the system; so
> > it would be possible to represent each system as a separate physical
> > system. This could work fairly well for small numbers of replicas,
> > though might have some problems with large numbers of replicas. If
> > each replica was an entire new system, it could make replica exchange
> > quite slow, since calculating the energies of a given configuration
> > with a different replica's energy function would be very costly.
> >
> > Right now, for alchemical interactions, there are only two
> > representations of the system, and then lambda interpolates between
> > the two representations of the systems to create the alchemical
> > intermediates. Changing the way the code is structured would again
> > require some significant changes and time.
> >
> > On Thu, Mar 31, 2016 at 10:49 AM, Christopher Neale
> > <chris.neale at alum.utoronto.ca> wrote:
> > > Dear Szilárd, Mark, and Michael:
> > >
> > > Thank you for your suggestions.
> > >
> > > 1. I did use the icc compiler
> > > 2. I tested usage of nstlist=10 by setting verlet-buffer-tolerance=-1
> (I
> > had already set nstlist=10 but mdrun changes that to 25 in my previous
> > setup) -- this did not improve the performance
> > > 3. I tested increasing .mdp option nstdhdl from 200 to 20,000 both with
> > and without also increasing the mdrun -replex option from 200 to 20,000
> --
> > this did not improve the performance
> > > 4. I tried to test more than one openMP thread per rank. I am fuzzy on
> > this part of the usage, but I tried "ibrun -np 4 gmx_mpi mdrun -ntomp 6"
> > and it was a lot slower (I think it only used 4 cores?); I also tried as
> > above but with also -ntmpi 4 and it errored with a message about not
> being
> > able to set thread-MPI since I didn't compile with it. In any event, I
> > doubted that an optimization here would get back most of the 12x
> > performance loss so I didn't pursue this further.
> > >
> > > The good news is that there is a beta version of the H-REMD fork of
> > Plumed that works with gromacs 5.1, so I have a viable route for now.
> > >
> > > One more question: why does the free energy code have to use its own
> > kernel? I realize that I'm going to sound like an idiot here, but why
> can't
> > one just tweak parameters outside of the kernel and then have the
> optimized
> > kernel do the dynamics? I presume that one has to step outside of the
> > kernel to do the replica exchanges, so why can't the code just use the
> > optimized kernel to do the dynamics between exchanges?
> > >
> > > Thank you again for all of your help,
> > > Chris.
> > >
> > > ________________________________________
> > > From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se <
> > gromacs.org_gmx-users-bounces at maillist.sys.kth.se> on behalf of Szilárd
> > Páll <pall.szilard at gmail.com>
> > > Sent: 30 March 2016 12:59
> > > To: Discussion list for GROMACS users
> > > Subject: Re: [gmx-users] Looking for the source of the H-REMD slowdown
> > when decoupling a lot of atoms
> > >
> > > Chris,
> > >
> > > I can suggest two possible tweaks, these won't do miracles, but may
> give
> > > you a bit better performance.
> > >
> > > * icc is better than gcc at optimizing the naiive free energy kernel
> > code.
> > > I observed in the past up to 1.5x faster free energy kernel performance
> > > with icc 15 vs gcc 4.9 or so.
> > >
> > > * The default nstlist heuristic/suggestion assumes fast non-bondeds. In
> > > your case, depending on how long is the list buffer with the nstlist
> you
> > > use (25?), you may be able to gain a bit of performance by shifting
> load
> > > back to the search with a smaller nstlist, e.g. 10.
> > >
> > > These two combined could give in best case 2x, I guess.
> > >
> > > What I find slightly unusual in the logs you posted on redmine is the
> > 3-4x
> > > slowdown in PME and constraints (I'd expect ~2x in PME and less in
> > > constraints), but it could be that this too is simply because most
> > > interaction in the system are perturbed which trigger non-optimized
> > > code-paths.
> > >
> > > Cheers,
> > > --
> > > Szilárd
> > >
> > > On Wed, Mar 30, 2016 at 3:05 PM, Mark Abraham <
> mark.j.abraham at gmail.com>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> Yeah, unfortunately Michael's pretty much right - the free-energy
> > kernel is
> > >> currently that cousin nobody talks about (or to). It's essentially
> > >> unchanged since GROMACS 4.0 days, except that the Verlet scheme has
> some
> > >> kludge so that it can call the same kernel that the group scheme used
> to
> > >> call. But it has none of the optimizations and also some simplifying
> > >> pessimizations. The result is highly unsuitable for Chris's use case,
> > but
> > >> not horrible for the more normal case of perturbing a small part of a
> > >> system. For now, I can only suggest trying a run with more than one
> > OpenMP
> > >> thread per rank, but there's nothing in the log file snippets that
> > fills me
> > >> with any hope that it would be noticeably faster.
> > >>
> > >> We have a pile of infrastructure built and in the works that will lead
> > to
> > >> being able to offer similarly optimized free-energy kernels, but they
> > won't
> > >> see the light of day until next year, I'm afraid. A set of sample
> .tprs
> > at
> > >> Redmine 742 would be most welcome, however - it's very good for us to
> > know
> > >> when/that we're optimizing a workflow someone actually wants to run,
> but
> > >> currently has reason to avoid.
> > >>
> > >> Mark
> > >>
> > >> On Wed, Mar 30, 2016 at 8:05 AM Michael Shirts <mrshirts at gmail.com>
> > wrote:
> > >>
> > >> > Hi, Chris: I'm pretty sure that it's because the nonbonded free
> > >> > energies are much slower than the standard free energies.  You
> state:
> > >> >
> > >> > > I took a look at gmxlib/nonbonded/nb_free_energy.c in v.5.1.2,
> but I
> > >> was
> > >> > unable to find a function called "gmx_waste_time_here()" and beyond
> > that
> > >> I
> > >> > was out of my depth.
> > >> >
> > >> > But it's much more the fact that the non-free energy nonbondeds are
> > >> > SUPER optimized.
> > >> >
> > >> > I don't see a particularly viable way around this for now. The only
> > >> > thing I can think of splitting the neighborlists into two force
> calls,
> > >> > and scaling the forces and energies coming out of those.  That would
> > >> > be a huge pain.
> > >> >
> > >> > On Tue, Mar 29, 2016 at 9:50 PM, Christopher Neale
> > >> > <chris.neale at alum.utoronto.ca> wrote:
> > >> > > Dear Users:
> > >> > >
> > >> > > I am trying to do some Hamiltonian replica exchange (H-REMD) in
> > gromacs
> > >> > 5.1.2 and am running up against really large slowdowns when
> > decoupling a
> > >> > large number of atoms. I am decoupling 5360 atoms out of the 15520
> > atoms
> > >> in
> > >> > my system. The goal is not to get a PMF, but to enhance sampling
> using
> > >> the
> > >> > REST approach to partially decouple lipids in a bilayer. This
> approach
> > >> > enhances lipid relaxation times (
> > >> > http://pubs.acs.org/doi/pdf/10.1021/ct500305u ) though the authors
> of
> > >> > that paper modified the gromacs code to do their own H-REMD in order
> > to
> > >> > avoid the really slow speed they also got when decoupling lots of
> > atoms
> > >> via
> > >> > the free energy code.
> > >> > >
> > >> > > I have already posted details here
> > >> http://redmine.gromacs.org/issues/742
> > >> > , which includes .mdp options and some timing output. I compare the
> > >> timing
> > >> > output to a standard temperature REMD (T-REMD) run. For my usage,
> the
> > >> > slowdown is about 12x for H-REMD vs. T-REMD.
> > >> > >
> > >> > > I am motivated to find a solution within gromacs because the
> > >> alternative
> > >> > is to use gromacs 4.6.7 with plumed (or with the aforementioned
> > modified
> > >> > code, which is also gromacs v4). Normally that would be a viable
> > option,
> > >> > but I am using the charmm force field and the charmm TIP3P water
> and I
> > >> > would rather not give up the speed boost that I see in gromacs
> v5.1.2,
> > >> > which allows the use of the verlet cutoff scheme and has been tested
> > and
> > >> > shown to give the correct reproduction of charmm forces (vs. the
> > forces
> > >> one
> > >> > would get using the charmm simulation software).
> > >> > >
> > >> > > I took a look at gmxlib/nonbonded/nb_free_energy.c in v.5.1.2,
> but I
> > >> was
> > >> > unable to find a function called "gmx_waste_time_here()" and beyond
> > that
> > >> I
> > >> > was out of my depth.
> > >> > >
> > >> > > Thank you for any pointers,
> > >> > > Chris.
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Gromacs Users mailing list
> > >> > >
> > >> > > * Please search the archive at
> > >> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > >> > posting!
> > >> > >
> > >> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >> > >
> > >> > > * For (un)subscribe requests visit
> > >> > >
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users
> > or
> > >> > send a mail to gmx-users-request at gromacs.org.
> > >> > --
> > >> > Gromacs Users mailing list
> > >> >
> > >> > * Please search the archive at
> > >> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > >> > posting!
> > >> >
> > >> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >> >
> > >> > * For (un)subscribe requests visit
> > >> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users
> or
> > >> > send a mail to gmx-users-request at gromacs.org.
> > >> >
> > >> --
> > >> Gromacs Users mailing list
> > >>
> > >> * Please search the archive at
> > >> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > >> posting!
> > >>
> > >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >>
> > >> * For (un)subscribe requests visit
> > >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > >> send a mail to gmx-users-request at gromacs.org.
> > >>
> > > --
> > > Gromacs Users mailing list
> > >
> > > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> > >
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> > > * For (un)subscribe requests visit
> > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-request at gromacs.org.
> > > --
> > > Gromacs Users mailing list
> > >
> > > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> > >
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> > > * For (un)subscribe requests visit
> > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-request at gromacs.org.
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-request at gromacs.org.
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-request at gromacs.org.
> >
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-request at gromacs.org.
>
--
Gromacs Users mailing list

* Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

* For (un)subscribe requests visit
https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.