[gmx-users] FEP and loss of performance

Wed Apr 6 17:36:44 CEST 2011

I posted my test files at:
https://www.dropbox.com/link/17.-sUcJyMeEL?k=0f3b6fa098389405e7e15c886dcc83c1
This is a run of a dialanine peptide in a water box.
The cubic box has a side length of 40 A.
The directory is organized as follows:
TEST/
        topol.top
        Run-00/confout.gro    ; equilibrated structure
        Run-00/state.cp

        MD-std/Commands       ; commands to run the simulation (grompp and mdrun)
        MD-std/md.mdp

        MD-FEP/Commands
        MD-FEP/md.mdp

Total size: ~700 kB


> David Mobley wrote:
> > Hi,
> > 
> > This doesn't sound like normal behavior. In fact, this is not what I
> > typically observe. While there may be a small performance difference,
> > it is probably at the level of a few percent. Certainly not a factor
> > of more than 10.
> 
> I see about a 50% reduction in speed when decoupling small molecules in
> water. For me, I don't care if a nanosecond takes 2 or 3 hours.  For
> larger systems such as the ones considered here, it seems that the
> performance loss is much more dramatic.
> 
> I can reproduce the poor performance with a simple water box with the free
> energy code on.  Decoupling the whole system (or at least, a large part of
> it, as was the original intent of this thread, as I understand it) results
> in a 1500% slowdown.  Some observations:
> 
> 1. Water optimizations are turned off when decoupling the water, but this
> only accounts for 20% of the slowdown, which is relatively insignificant.
> 
> 2. Using lambda=0.9 (from a previous post) in my water box results in even
> worse performance, but much of this is due to DD instability.  The system
> I used has a few hundred water molecules in it, and after about 10-12 ps,
> they collapse in on one another and form clusters, dramatically shifting
> the balance of atoms between DD cells.  DLB gets activated but the force
> imbalances are around 40%, and the total slowdown (relative to
> non-perturbed trajectories) is 2000%.
> 
> 3. Using lambda=0 results in stable trajectories with very low imbalance,
> but also poor performance.  It seems that mdrun spends all of its time in
> the free energy innerloops:
> 
>  Computing:                  M-Number         M-Flops  % Flops
> ---------------------------------------------------------------
>  Free energy innerloop   19064.187513     2859628.127     89.1
>  Outer nonbonded loop      325.153806        3251.538      0.1
>  Calc Weights              231.754635        8343.167      0.3
>  Spread Q Bspline         9888.197760       19776.396      0.6
>  Gather F Bspline         9888.197760       59329.187      1.8
>  3D-FFT                  24406.688124      195253.505      6.1
>  Solve PME                 485.109702       31047.021      1.0
>  NS-Pairs                  521.616615       10953.949      0.3
>  Reset In Box                2.575515           7.727      0.0
>  CG-CoM                      7.728090          23.184      0.0
>  Virial                      8.176635         147.179      0.0
>  Update                     77.251545        2394.798      0.1
>  Stop-CM                     0.774045           7.740      0.0
>  Calc-Ekin                  77.253090        2085.833      0.1
>  Constraint-V               77.253090         618.025      0.0
>  Constraint-Vir              7.726545         185.437      0.0
>  Settle                     51.502060       16635.165      0.5
> ---------------------------------------------------------------
>  Total                                    3209687.978    100.0
> ---------------------------------------------------------------
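
For anyone who wants to check where the flops go in their own run, here is a minimal sketch in Python. It assumes the rows of the flop accounting quoted above are pasted in as plain text, one kernel per line. The names parse_flop_table and flop_shares are only illustrative helpers, not part of GROMACS.

```python
import re

# Rows copied from the flop accounting quoted above:
# kernel name, M-Number, M-Flops, % Flops.
FLOP_TABLE = """\
Free energy innerloop   19064.187513     2859628.127     89.1
Outer nonbonded loop      325.153806        3251.538      0.1
Calc Weights              231.754635        8343.167      0.3
Spread Q Bspline         9888.197760       19776.396      0.6
Gather F Bspline         9888.197760       59329.187      1.8
3D-FFT                  24406.688124      195253.505      6.1
Solve PME                 485.109702       31047.021      1.0
NS-Pairs                  521.616615       10953.949      0.3
Reset In Box                2.575515           7.727      0.0
CG-CoM                      7.728090          23.184      0.0
Virial                      8.176635         147.179      0.0
Update                     77.251545        2394.798      0.1
Stop-CM                     0.774045           7.740      0.0
Calc-Ekin                  77.253090        2085.833      0.1
Constraint-V               77.253090         618.025      0.0
Constraint-Vir              7.726545         185.437      0.0
Settle                     51.502060       16635.165      0.5
"""

# A kernel name followed by three decimal numbers.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<mnumber>\d+\.\d+)\s+"
    r"(?P<mflops>\d+\.\d+)\s+(?P<pct>\d+\.\d+)$"
)


def parse_flop_table(text):
    """Return a list of (kernel name, M-Flops) pairs; skip non-matching lines."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append((match["name"], float(match["mflops"])))
    return rows


def flop_shares(rows):
    """Return (kernel name, percent of total M-Flops), largest share first."""
    total = sum(mflops for _, mflops in rows)
    shares = [(name, 100.0 * mflops / total) for name, mflops in rows]
    return sorted(shares, key=lambda item: item[1], reverse=True)


rows = parse_flop_table(FLOP_TABLE)
shares = flop_shares(rows)
for name, share in shares[:3]:
    print(f"{name:<24s}{share:6.1f} %")
```

The recomputed shares should agree with the "% Flops" column, so the free energy inner loop dominates the flop count.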
> 
> > You may want to provide an mdp file and topology, etc. so someone can
> > see if they can reproduce your problem.
> 
> I agree that would be useful.  I can contribute my water box system if it
> would help, as well.
> 
> -Justin
> 
> > Thanks.
> > 
> > On Wed, Apr 6, 2011 at 7:59 AM, Luca Bellucci <lcbllcc at gmail.com> wrote:
> >> I followed your suggestions and tried to perform an MD run with GROMACS
> >> and NAMD for a dialanine peptide in a water box. The cubic box has a
> >> side length of 40 A.
> >> 
> >> GROMACS:
> >> With the free energy module, GROMACS performance drops by a factor of
> >> about 10 to 20.
> >> Standard MD:      Time:          6.693       6.693    100.0
> >> Free energy MD:   Time:    136.113    136.113    100.0
> >> 
> >> NAMD:
> >> With the free energy module, the drop in performance is not as
> >> pronounced as in GROMACS.
> >> Standard MD   6.900000
> >> Free energy MD 9.600000
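
To put the two programs on the same footing, the timings quoted above can be turned into slowdown factors. This is only a sketch: the slowdown helper is a made-up name, and it assumes the two numbers for each program are in the same units, which holds within a program but not necessarily between GROMACS and NAMD.

```python
def slowdown(standard, free_energy):
    """Return how many times longer the free-energy run takes than the standard run."""
    if standard <= 0:
        raise ValueError("standard run time must be positive")
    return free_energy / standard


# (standard MD, free energy MD) timings as quoted above.
# Units are whatever each program reported; only the ratio is used.
timings = {
    "GROMACS": (6.693, 136.113),
    "NAMD": (6.9, 9.6),
}

for program, (std, fep) in timings.items():
    print(f"{program}: {slowdown(std, fep):.1f}x slower with free energy on")
```

With these numbers GROMACS comes out roughly 20 times slower and NAMD roughly 1.4 times slower.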
> >> 
> >> I would like to point out that this kind of calculation is common; in
> >> fact, the GROMACS 4.5.3 manual states: "There is a
> >> special option system that couples all molecules types in the system.
> >> This can be useful for equilibrating a system [..]".
> >> 
> >> I would like to know whether there is a way to avoid the drop
> >> in GROMACS performance for this kind of calculation.
> >> 
> >> Luca
> >> 
> >>> I don't know if it is possible or not. I think that you can enhance
> >>> your chances of developer attention if you develop a small and simple
> >>> test system that reproduces the slowdown and very explicitly state
> >>> your case for why you can't use some other method. I would suggest
> >>> posting that to the mailing list and, if you don't get any response,
> >>> post it as an enhancement request on the redmine page (or whatever has
> >>> taken over from bugzilla).
> >>> 
> >>> Good luck,
> >>> Chris.
> >>> 
> >>> -- original message --
> >>> 
> >>> 
> >>> Yes, I am testing the possibility of performing Hamiltonian REMD.
> >>> Energy barriers can be overcome by increasing the system temperature or
> >>> by scaling the potential energy with a lambda value; these methods are
> >>> "equivalent". Both have advantages and disadvantages, but this is not
> >>> the right place to debate them. The main problem seems to be
> >>> how to overcome the loss of GROMACS performance in such
> >>> calculations. At the moment it seems to be an intrinsic code problem.
> >>> Is it possible?
> >>> 
> >>>> >> Dear Chris and Justin,
> >>>> >>
> >>>> >> Thank you for your valuable suggestions.
> >>>> >>
> >>>> >> This is a test that I performed on a single machine with 8 cores
> >>>> >> and GROMACS 4.5.4.
> >>>> >>
> >>>> >> I am trying to enhance the sampling of a protein using the
> >>>> >> decoupling scheme of the free energy module of GROMACS. However,
> >>>> >> when I decoupled only the protein, the protein collapsed. Because
> >>>> >> I simulated in NVT, I thought that this was an effect of the
> >>>> >> solvent. I was trying to decouple the solvent as well to
> >>>> >> understand the system behavior.
> >>>> 
> >>>>> Rather than suspect that the solvent is the problem, it's more likely
> >>>>> that decoupling an entire protein simply isn't stable.  I have never
> >>>>> tried anything that enormous, but the volume change in the system
> >>>>> could be unstable, along with any number of factors, depending on how
> >>>>> you approach it.
> >>>>> 
> >>>>> If you're looking for better sampling, REMD is a much more robust
> >>>>> approach than trying to manipulate the interactions of huge parts of
> >>>>> your system using the free energy code.
> >>>> 
> >>>> Presumably Luca is interested in some type of hamiltonian exchange
> >>>> where lambda represents the interactions between the protein and the
> >>>> solvent? This can actually be a useful method for enhancing sampling.
> >>>> I think it's dangerous if we rely too heavily on "try something else".
> >>>> I still see no methodological reason a priori why there should be any
> >>>> actual slowdown, so that makes me think that it's an implementation
> >>>> thing, and there is at least the possibility that this is something
> >>>> that could be fixed as an enhancement.
> >>>> 
> >>>> Chris.
> >>>> 
> >>>> 
> >>>> -Justin
> >>>> 
> >>>> > I expected a loss of performance, but not so drastic.
> >>>> >
> >>>> > Luca
> >>>> >
> >>>> >> Load balancing problems I can understand, but why would it take
> >>>> >> longer in absolute time? I would have thought that some nodes
> >>>> >> would simply be sitting idle, but this should not cause an
> >>>> >> increase in the overall simulation time (15x at that!).
> >>>> >>
> >>>> >> There must be some extra communication?
> >>>> >>
> >>>> >> I agree with Justin that this seems like a strange thing to do,
> >>>> >> but still I think that there must be some underlying coding
> >>>> >> issue (probably one that only exists because of a reasonable
> >>>> >> assumption that nobody would annihilate the largest part of
> >>>> >> their system).
> >>>> >>
> >>>> >> Chris.
> >>>> >>
> >>>> >> Luca Bellucci wrote:
> >>>> >>> Hi Chris,
> >>>> >>> thanks for the suggestions,
> >>>> >>> in the previous mail there is a mistake because
> >>>> >>> couple-moltype = SOL (for solvent) and not "Protein_chaim_P".
> >>>> >>> Now the problem of the load balance seems reasonable, because
> >>>> >>> the water box is large, ~9.0 nm.
> >>>> >>
> >>>> >> Now your outcome makes a lot more sense.  You're decoupling all
> >>>> >> of the solvent? I don't see how that is going to be physically
> >>>> >> stable or terribly
> >> 