[gmx-users] FEP and loss of performance

Thu Apr 7 14:54:35 CEST 2011

I would suggest that you take Chris' advice and post all of this as a feature 
request on redmine.gromacs.org so that it can be put on a to-do list.  Enhancing 
the performance of the free energy code is probably going to be a low-priority, 
long-term goal (in the absence of any proven bug), but at least it won't get 
lost in the shuffle of the mailing list.  If there's no record of it in redmine, 
it likely won't get addressed.  Gromacs is undergoing major changes at the 
moment, so the core developers are quite busy with other priorities.

-Justin

Luca Bellucci wrote:
> I posted my test files at:
> https://www.dropbox.com/link/17.-sUcJyMeEL?k=0f3b6fa098389405e7e15c886dcc83c1
> This is a run for a dialanine peptide in a water box.
> The cubic box has a side length of 40 A.
> The directory is organized as follows:
> TEST\
>         topol.top
>         Run-00/confout.gro    ; equilibrated structure
>         Run-00/state.cp
>
>         MD-std/Commands       ; commands to run the simulation (grompp and mdrun)
>         MD-std/md.mdp
>
>         MD-FEP/Commands
>         MD-FEP/md.mdp
>
> ~700 kb
> 
> 
>> David Mobley wrote:
>>> Hi,
>>>
>>> This doesn't sound like normal behavior. In fact, this is not what I
>>> typically observe. While there may be a small performance difference,
>>> it is probably at the level of a few percent. Certainly not a factor
>>> of more than 10.
>> I see about a 50% reduction in speed when decoupling small molecules in
>> water. For me, I don't care if a nanosecond takes 2 or 3 hours.  For
>> larger systems such as the ones considered here, it seems that the
>> performance loss is much more dramatic.
>>
>> I can reproduce the poor performance with a simple water box with the free
>> energy code on.  Decoupling the whole system (or at least, a large part of
>> it, as was the original intent of this thread, as I understand it) results
>> in a 1500% slowdown.  Some observations:
>>
>> 1. Water optimizations are turned off when decoupling the water, but this
>> only accounts for 20% of the slowdown, which is relatively insignificant.
>>
>> 2. Using lambda=0.9 (from a previous post) in my water box results in even
>> worse performance, but much of this is due to DD instability.  The system
>> I used has a few hundred water molecules in it, and after about 10-12 ps,
>> they collapse in on one another and form clusters, dramatically shifting
>> the balance of atoms between DD cells.  DLB gets activated but the force
>> imbalances are around 40%, and the total slowdown (relative to
>> non-perturbed trajectories) is 2000%.
>>
>> 3. Using lambda=0 results in stable trajectories with very low imbalance,
>> but also poor performance.  It seems that mdrun spends all of its time in
>> the free energy innerloops:
>>
>>   Computing:                 M-Number          M-Flops   % Flops
>> ------------------------------------------------------------------
>>   Free energy innerloop    19064.187513     2859628.127     89.1
>>   Outer nonbonded loop       325.153806        3251.538      0.1
>>   Calc Weights               231.754635        8343.167      0.3
>>   Spread Q Bspline          9888.197760       19776.396      0.6
>>   Gather F Bspline          9888.197760       59329.187      1.8
>>   3D-FFT                   24406.688124      195253.505      6.1
>>   Solve PME                  485.109702       31047.021      1.0
>>   NS-Pairs                   521.616615       10953.949      0.3
>>   Reset In Box                 2.575515           7.727      0.0
>>   CG-CoM                       7.728090          23.184      0.0
>>   Virial                       8.176635         147.179      0.0
>>   Update                      77.251545        2394.798      0.1
>>   Stop-CM                      0.774045           7.740      0.0
>>   Calc-Ekin                   77.253090        2085.833      0.1
>>   Constraint-V                77.253090         618.025      0.0
>>   Constraint-Vir               7.726545         185.437      0.0
>>   Settle                      51.502060       16635.165      0.5
>> ------------------------------------------------------------------
>>   Total                                      3209687.978    100.0
>> ------------------------------------------------------------------
>>
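
The flop accounting above can be cross-checked with a short script. This is only a sketch: the M-Flops values are copied from the table, while the dictionary layout, the `share` helper, and the grouping of five rows as "PME-related" are assumptions made for illustration and are not part of any GROMACS tool.

```python
# M-Flops per row, copied from the flop accounting quoted in this thread.
mflops = {
    "Free energy innerloop": 2859628.127,
    "Outer nonbonded loop": 3251.538,
    "Calc Weights": 8343.167,
    "Spread Q Bspline": 19776.396,
    "Gather F Bspline": 59329.187,
    "3D-FFT": 195253.505,
    "Solve PME": 31047.021,
    "NS-Pairs": 10953.949,
    "Reset In Box": 7.727,
    "CG-CoM": 23.184,
    "Virial": 147.179,
    "Update": 2394.798,
    "Stop-CM": 7.740,
    "Calc-Ekin": 2085.833,
    "Constraint-V": 618.025,
    "Constraint-Vir": 185.437,
    "Settle": 16635.165,
}
reported_total = 3209687.978  # "Total" row of the same table

total = sum(mflops.values())


def share(row):
    """Percentage of all M-Flops spent in the given row."""
    return 100.0 * mflops[row] / total


# Assumed grouping: rows that belong to the PME part of the calculation.
pme_rows = ["Calc Weights", "Spread Q Bspline", "Gather F Bspline",
            "3D-FFT", "Solve PME"]
pme_share = sum(share(row) for row in pme_rows)

print(f"Sum of rows:           {total:.3f} (reported {reported_total:.3f})")
print(f"Free energy innerloop: {share('Free energy innerloop'):.1f}%")
print(f"PME-related rows:      {pme_share:.1f}%")
```

The rows add up to the reported total, so the percentages in the table are consistent with the M-Flops column. Note that this counts floating-point operations, not wall-clock time.
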
>>> You may want to provide an mdp file and topology, etc. so someone can
>>> see if they can reproduce your problem.
>> I agree that would be useful.  I can contribute my water box system if it
>> would help, as well.
>>
>> -Justin
>>
>>> Thanks.
>>>
>>> On Wed, Apr 6, 2011 at 7:59 AM, Luca Bellucci <lcbllcc at gmail.com> wrote:
>>>> I followed your suggestions and tried to perform an MD run with GROMACS
>>>> and NAMD for a dialanine peptide in a water box. The cubic box had a
>>>> side length of 40 A.
>>>>
>>>> GROMACS:
>>>> With the free energy module, gromacs performance drops by a factor of
>>>> about 10 to 20.
>>>> Standard MD:      Time:          6.693       6.693    100.0
>>>> Free energy MD:   Time:    136.113    136.113    100.0
>>>>
>>>> NAMD:
>>>> With the free energy module, the drop in performance is not nearly as
>>>> pronounced as in gromacs.
>>>> Standard MD   6.900000
>>>> Free energy MD 9.600000
>>>>
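
The slowdown implied by these timings can be worked out directly. The snippet below is a minimal sketch: the numbers are the ones quoted above, in whatever units each program reported, and the `timings` layout and `slowdown` helper are hypothetical names introduced here. Only the ratio within each program is meaningful, since the two programs may not report the same quantity.

```python
# Timings as quoted in this thread (units as reported by each program).
timings = {
    "GROMACS": {"standard": 6.693, "free_energy": 136.113},
    "NAMD": {"standard": 6.9, "free_energy": 9.6},
}


def slowdown(standard, free_energy):
    """Return how many times longer the free energy run took."""
    return free_energy / standard


ratios = {name: slowdown(**t) for name, t in timings.items()}

for name, ratio in ratios.items():
    print(f"{name}: free energy run takes {ratio:.1f}x as long")
```

With these inputs the GROMACS ratio is about 20 and the NAMD ratio is about 1.4.
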
>>>> I would like to point out that this kind of calculation is common; in
>>>> fact, the gromacs 4.5.3 manual states: " There is a
>>>> special option system that couples all molecules types in the system.
>>>> This can be useful for equilibrating a system [..] ".
>>>>
>>>> I would like to understand whether there is a way to resolve the drop
>>>> in gromacs performance for this kind of calculation.
>>>>
>>>> Luca
>>>>
>>>>> I don't know if it is possible or not. I think that you can enhance
>>>>> your chances of developer attention if you develop a small and simple
>>>>> test system that reproduces the slowdown and very explicitly state
>>>>> your case for why you can't use some other method. I would suggest
>>>>> posting that to the mailing list and, if you don't get any response,
>>>>> post it as an enhancement request on the redmine page (or whatever has
>>>>> taken over from bugzilla).
>>>>>
>>>>> Good luck,
>>>>> Chris.
>>>>>
>>>>> -- original message --
>>>>>
>>>>>
>>>>> Yes, I am testing the possibility of performing Hamiltonian REMD.
>>>>> Energy barriers can be overcome by increasing the system temperature or
>>>>> by scaling the potential energy with a lambda value; these methods are
>>>>> "equivalent". Both have advantages and disadvantages, but this is not
>>>>> the right place to debate them. The main problem seems to be
>>>>> how to overcome the loss of gromacs performance in such a
>>>>> calculation.  At the moment it seems to be an intrinsic code problem.
>>>>> Is it possible?
>>>>>
>>>>>>>> Dear Chris and Justin
>>>>>>>> Thank you for your precious suggestions
>>>>>>>> This is a test that I performed on a single machine with 8 cores
>>>>>>>> and gromacs 4.5.4.
>>>>>>>>
>>>>>>>> I am trying to enhance the sampling of a protein using the decoupling
>>>>>>>> scheme of the free energy module of gromacs.  However, when I decouple
>>>>>>>> only the protein, the protein collapses. Because I simulated in NVT, I
>>>>>>>> thought that this was an effect of the solvent. I was also trying to
>>>>>>>> decouple the solvent to understand the system behavior.
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>> Rather than suspect that the solvent is the problem, it's more likely
>>>>>>> that decoupling an entire protein simply isn't stable.  I have never
>>>>>>> tried anything that enormous, but the volume change in the system
>>>>>>> could be unstable, along with any number of factors, depending on how
>>>>>>> you approach it.
>>>>>>>
>>>>>>> If you're looking for better sampling, REMD is a much more robust
>>>>>>> approach than trying to manipulate the interactions of huge parts of
>>>>>>> your system using the free energy code.
>>>>>> Presumably Luca is interested in some type of Hamiltonian exchange
>>>>>> where lambda represents the interactions between the protein and the
>>>>>> solvent? This can actually be a useful method for enhancing sampling.
>>>>>> I think it's dangerous if we rely too heavily on "try something else".
>>>>>> I still see no methodological reason a priori why there should be any
>>>>>> actual slowdown, so that makes me think that it's an implementation
>>>>>> thing, and there is at least the possibility that this is something
>>>>>> that could be fixed as an enhancement.
>>>>>>
>>>>>> Chris.
>>>>>>
>>>>>>
>>>>>> -Justin
>>>>>>
>>>>>>> I expected a loss of performance, but not so drastic.
>>>>>>> Luca
>>>>>>>
>>>>>>>> Load balancing problems I can understand, but why would it take
>>>>>>>> longer in absolute time? I would have thought that some nodes
>>>>>>>> would simply be sitting idle, but this should not cause an
>>>>>>>> increase in the overall simulation time (15x at that!).
>>>>>>>>
>>>>>>>> There must be some extra communication?
>>>>>>>>
>>>>>>>> I agree with Justin that this seems like a strange thing to do,
>>>>>>>> but still I think that there must be some underlying coding
>>>>>>>> issue (probably one that only exists because of a reasonable
>>>>>>>> assumption that nobody would annihilate the largest part of
>>>>>>>> their system).
>>>>>>>>
>>>>>>>> Chris.
>>>>>>>>
>>>>>>>> Luca Bellucci wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>> thanks for the suggestions,
>>>>>>>>> in the previous mail there is a mistake because
>>>>>>>>> couple-moltype = SOL (for solvent) and not "Protein_chaim_P".
>>>>>>>>> Now the problem of the load balance seems reasonable, because
>>>>>>>>> the water box is large ~9.0 nm.
>>>>>>>>
>>>>>>>> Now your outcome makes a lot more sense.  You're decoupling all
>>>>>>>> of the solvent? I don't see how that is going to be physically
>>>>>>>> stable or terribly [...]

-- 
========================================

Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================