[gmx-users] FEP and loss of performance
Justin A. Lemkul
jalemkul at vt.edu
Thu Apr 7 14:54:35 CEST 2011
I would suggest that you take Chris' advice and post all of this as a feature
request on redmine.gromacs.org so that it can be put on a to-do list. Enhancing
the performance of the free energy code is probably going to be a low-priority,
long-term goal (in the absence of any proven bug), but at least it won't get
lost in the shuffle of the mailing list. If there's no record of it in redmine,
it likely won't get addressed. Gromacs is undergoing major changes at the
moment, so the core developers are quite busy with other priorities.
-Justin
Luca Bellucci wrote:
> I posted my test files in:
> https://www.dropbox.com/link/17.-sUcJyMeEL?k=0f3b6fa098389405e7e15c886dcc83c1
> This is a run for a dialanine peptide in a water box.
> The cubic box has a side length of 40 A.
> The directory is organized as :
> TEST\
> topol.top
> Run-00/confout.gro ; Equilibrated structure
> Run-00/state.cp
>
> MD-std/Commands ; commands to run the simulation , grompp and mdrun
> MD-std/md.mdp
>
> MD-FEP/Commands
> MD-FEP/md.mdp
>
> ~700 kb
>
>
>> David Mobley wrote:
>>> Hi,
>>>
>>> This doesn't sound like normal behavior. In fact, this is not what I
>>> typically observe. While there may be a small performance difference,
>>> it is probably at the level of a few percent. Certainly not a factor
>>> of more than 10.
>> I see about a 50% reduction in speed when decoupling small molecules in
>> water. For me, I don't care if a nanosecond takes 2 or 3 hours. For
>> larger systems such as the ones considered here, it seems that the
>> performance loss is much more dramatic.
>>
>> I can reproduce the poor performance with a simple water box with the free
>> energy code on. Decoupling the whole system (or at least, a large part of
>> it, as was the original intent of this thread, as I understand it) results
>> in a 1500% slowdown. Some observations:
>>
>> 1. Water optimizations are turned off when decoupling the water, but this
>> only accounts for 20% of the slowdown, which is relatively insignificant.
>>
>> 2. Using lambda=0.9 (from a previous post) in my water box results in even
>> worse performance, but much of this is due to DD instability. The system
>> I used has a few hundred water molecules in it, and after about 10-12 ps,
>> they collapse in on one another and form clusters, dramatically shifting
>> the balance of atoms between DD cells. DLB gets activated but the force
>> imbalances are around 40%, and the total slowdown (relative to
>> non-perturbed trajectories) is 2000%.
>>
>> 3. Using lambda=0 results in stable trajectories with very low imbalance,
>> but also poor performance. It seems that mdrun spends all of its time in
>> the free energy innerloops:
>>
>>  Computing:                 M-Number         M-Flops   % Flops
>> -----------------------------------------------------------------
>>  Free energy innerloop   19064.187513     2859628.127     89.1
>>  Outer nonbonded loop      325.153806        3251.538      0.1
>>  Calc Weights              231.754635        8343.167      0.3
>>  Spread Q Bspline         9888.197760       19776.396      0.6
>>  Gather F Bspline         9888.197760       59329.187      1.8
>>  3D-FFT                  24406.688124      195253.505      6.1
>>  Solve PME                 485.109702       31047.021      1.0
>>  NS-Pairs                  521.616615       10953.949      0.3
>>  Reset In Box                2.575515           7.727      0.0
>>  CG-CoM                      7.728090          23.184      0.0
>>  Virial                      8.176635         147.179      0.0
>>  Update                     77.251545        2394.798      0.1
>>  Stop-CM                     0.774045           7.740      0.0
>>  Calc-Ekin                  77.253090        2085.833      0.1
>>  Constraint-V               77.253090         618.025      0.0
>>  Constraint-Vir              7.726545         185.437      0.0
>>  Settle                     51.502060       16635.165      0.5
>> -----------------------------------------------------------------
>>  Total                                    3209687.978    100.0
>> -----------------------------------------------------------------
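As a rough cross-check of the accounting table above, the following Python sketch recomputes each kernel's share of the total from the M-Flops column. The dictionary is transcribed by hand from the table and the variable names are illustrative only; nothing here is produced by mdrun or any GROMACS tool.

```python
# M-Flops per kernel, transcribed by hand from the accounting table above.
mflops = {
    "Free energy innerloop": 2859628.127,
    "Outer nonbonded loop": 3251.538,
    "Calc Weights": 8343.167,
    "Spread Q Bspline": 19776.396,
    "Gather F Bspline": 59329.187,
    "3D-FFT": 195253.505,
    "Solve PME": 31047.021,
    "NS-Pairs": 10953.949,
    "Reset In Box": 7.727,
    "CG-CoM": 23.184,
    "Virial": 147.179,
    "Update": 2394.798,
    "Stop-CM": 7.740,
    "Calc-Ekin": 2085.833,
    "Constraint-V": 618.025,
    "Constraint-Vir": 185.437,
    "Settle": 16635.165,
}
reported_total = 3209687.978  # "Total" row of the table

# Recompute the total and each kernel's percentage share of it.
total = sum(mflops.values())
share = {name: 100.0 * value / total for name, value in mflops.items()}

# Print the three largest contributors, largest first.
for name, pct in sorted(share.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(f"{name:24s} {pct:5.1f} %")
```

The recomputed total agrees with the reported one, and the free energy inner loop dominates the flop count, which is consistent with observation 3.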
>>
>>> You may want to provide an mdp file and topology, etc. so someone can
>>> see if they can reproduce your problem.
>> I agree that would be useful. I can contribute my water box system if it
>> would help, as well.
>>
>> -Justin
>>
>>> Thanks.
>>>
>>> On Wed, Apr 6, 2011 at 7:59 AM, Luca Bellucci <lcbllcc at gmail.com> wrote:
>>>> I followed your suggestions and tried to perform an MD run with GROMACS
>>>> and NAMD for a dialanine peptide in a water box. The cubic box had a
>>>> side length of 40 A.
>>>>
>>>> GROMACS:
>>>> With the free energy module, GROMACS performance drops by a factor
>>>> of about 10 to 20.
>>>> Standard MD: Time: 6.693 6.693 100.0
>>>> Free energy MD: Time: 136.113 136.113 100.0
>>>>
>>>> NAMD:
>>>> With the free energy module, the drop in performance is much less
>>>> pronounced than in GROMACS.
>>>> Standard MD 6.900000
>>>> Free energy MD 9.600000
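To put the four timings quoted above on a common footing, here is a small Python sketch that computes the slowdown factor for each package. It assumes that the two numbers reported for a given package are in the same unit (the NAMD unit is not stated in the post), so only the ratio within a package is meaningful. The function name `slowdown` is made up for this example.

```python
def slowdown(standard: float, free_energy: float) -> float:
    """Return how many times longer the free energy run took."""
    return free_energy / standard

# Timings as quoted in the post; units are assumed consistent per package.
gromacs = slowdown(standard=6.693, free_energy=136.113)
namd = slowdown(standard=6.9, free_energy=9.6)

print(f"GROMACS: {gromacs:.1f}x slower with free energy on")
print(f"NAMD:    {namd:.2f}x slower with free energy on")
```

On these numbers the GROMACS run is roughly 20 times slower, against about 1.4 times for NAMD.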
>>>>
>>>> I would like to point out that this kind of calculation is common, in
>>>> fact in the manual of gromacs 4.5.3 it is reported " There is a
>>>> special option system that couples all molecules types in the system.
>>>> This can be useful for equilibrating a system [..] ".
>>>>
>>>> I would like to know whether there is a way to avoid the drop
>>>> in GROMACS performance for this kind of calculation.
>>>>
>>>> Luca
>>>>
>>>>> I don't know if it is possible or not. I think that you can enhance
>>>>> your chances of developer attention if you develop a small and simple
>>>>> test system that reproduces the slowdown and very explicitly state
>>>>> your case for why you can't use some other method. I would suggest
>>>>> posting that to the mailing list and, if you don't get any response,
>>>>> post it as an enhancement request on the redmine page (or whatever has
>>>>> taken over from bugzilla).
>>>>>
>>>>> Good luck,
>>>>> Chris.
>>>>>
>>>>> -- original message --
>>>>>
>>>>>
>>>>> Yes, I am testing the possibility of performing Hamiltonian REMD.
>>>>> Energy barriers can be overcome by increasing the system temperature
>>>>> or by scaling the potential energy with a lambda value; these methods
>>>>> are "equivalent". Both have advantages and disadvantages, and this is
>>>>> not the right place to debate them. The main problem seems to be
>>>>> how to overcome the loss of gromacs performance in such
>>>>> calculations. At the moment it seems to be an intrinsic code problem.
>>>>> Is it possible?
>>>>>
>>>>>>>> Dear Chris and Justin,
>>>>>>>> Thank you for your valuable suggestions.
>>>>>>>> This is a test that I performed on a single machine with 8 cores
>>>>>>>> and gromacs 4.5.4.
>>>>>>>>
>>>>>>>> I am trying to enhance the sampling of a protein using the
>>>>>>>> decoupling scheme of the free energy module of gromacs. However,
>>>>>>>> when I decouple only the protein, the protein collapses. Because
>>>>>>>> I simulated in NVT, I thought that this was an effect of the
>>>>>>>> solvent. I was trying to decouple the solvent as well to
>>>>>>>> understand the system behavior.
>>>>>>>>
>>>>>>
>>>>>>> Rather than suspect that the solvent is the problem, it's more likely
>>>>>>> that decoupling an entire protein simply isn't stable. I have never
>>>>>>> tried anything that enormous, but the volume change in the system
>>>>>>> could be unstable, along with any number of factors, depending on how
>>>>>>> you approach it.
>>>>>>>
>>>>>>> If you're looking for better sampling, REMD is a much more robust
>>>>>>> approach than trying to manipulate the interactions of huge parts of
>>>>>>> your system using the free energy code.
>>>>>> Presumably Luca is interested in some type of Hamiltonian exchange
>>>>>> where lambda represents the interactions between the protein and the
>>>>>> solvent? This can actually be a useful method for enhancing sampling.
>>>>>> I think it's dangerous if we rely too heavily on "try something else".
>>>>>> I still see no methodological reason a priori why there should be any
>>>>>> actual slowdown, so that makes me think that it's an implementation
>>>>>> thing, and there is at least the possibility that this is something
>>>>>> that could be fixed as an enhancement.
>>>>>>
>>>>>> Chris.
>>>>>>
>>>>>>
>>>>>> -Justin
>>>>>>
>>>>>>> I expected a loss of performance, but not so drastic.
>>>>>>> Luca
>>>>>>>
>>>>>>>> Load balancing problems I can understand, but why would it take
>>>>>>>> longer in absolute time? I would have thought that some nodes
>>>>>>>> would simply be sitting idle, but this should not cause an
>>>>>>>> increase in the overall simulation time (15x at that!).
>>>>>>>>
>>>>>>>> There must be some extra communication?
>>>>>>>>
>>>>>>>> I agree with Justin that this seems like a strange thing to do,
>>>>>>>> but still I think that there must be some underlying coding
>>>>>>>> issue (probably one that only exists because of a reasonable
>>>>>>>> assumption that nobody would annihilate the largest part of
>>>>>>>> their system).
>>>>>>>>
>>>>>>>> Chris.
>>>>>>>>
>>>>>>>> Luca Bellucci wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>> thanks for the suggestions,
>>>>>>>>> in the previous mail there is a mistake because
>>>>>>>>> couple-moltype = SOL (for solvent) and not
>>>>>>>>> "Protein_chaim_P". Now the problem of the load balance
>>>>>>>>> seems reasonable, because the water box is large ~9.0 nm.
>>>>>>>>
>>>>>>>> Now your outcome makes a lot more sense. You're decoupling all
>>>>>>>> of the solvent? I don't see how that is going to be physically
>>>>>>>> stable or terribly
--
========================================
Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
========================================