[gmx-developers] Reaction Filed crash!

Berk Hess hess at kth.se
Mon Dec 19 09:01:30 CET 2011

On 12/18/2011 04:29 PM, David van der Spoel wrote:
> On 12/18/11 1:27 AM, baptista at itqb.unl.pt wrote:
>> On Sat, 17 Dec 2011, Berk Hess wrote:
>>> I will try to clarify the situation a bit.
>>> Pre-4.5 there were 4 approximation/errors in this typical "Gromos" 
>>> setup
>>> and a complicating factor:
>>> 1. Non-buffered pair-lists reused for nstlist-1 steps: no real
>>> Hamiltonian
>>> 2. Reaction-field with epsilon_rf<infinity: discontinuous energy and
>>> force at cut-off
>>> 3. Twin-range interactions (or multiple time stepping) with 5*2=10 fs
>>> for the forces
>>> beyond 0.9 nm, but the reorientation of water only allows time steps
>>> up to 5 fs
>>> 4. A non-reversible multiple time stepping scheme
>>> 5. A thermostat with strong coupling (pre-4.0 also Berendsen: one
>>> other error)
>>> Factors 1-4 cause serious integration errors in the long-range
>>> electrostatics
>>> which all lead to serious heating of the system, but different for
>>> water and protein.
>>> Factor 5 couples off this heat, but with such strong heating you no
>>> longer know
>>> what happening.
>>> In version 4.5 I fixed factor 4. This takes out one source of error
>>> and heating.
>>> But as different errors affect different aspects/components
>>> differently, it's very
>>> difficult to tell what will happen.
>>> Factors 1-3 can be fixed at a high computational cost (although 4.6
>>> might help here).
>>> My hypothesis is that factor 4 caused massive heating everywhere and
>>> combined
>>> with a strongly coupled thermostat, all effects a seriously dampened,
>>> including those of 1-3.
>>> Fixing factor 4 gives less heating, thus less thermostat coupling and
>>> possibly less
>>> uniform heating between water and protein (due to effects of 1-3 being
>>> damped less),
>>> causing hydrogens in the protein to overheat. But as there are so many
>>> complicating factors,
>>> it's difficult to find out what's really happening.
>>> When I fixed factors 1-3, by different mdp settings, I get perfect
>>> energy conservation in 4.5,
>>> but that's at a high computational cost.
>>> Cheers,
>>> Berk
>> Thanks for clarifying the different factors involved. I see what you
>> mean. The relation between these factors is intricate and difficult to
>> antecipate, and trying to fix one of them ended up making things
>> unstable when using some setups (such as ours).
> As I pointed out earlier, due to large inherent heating in your setup, 
> the system becomes unstable, and one can delay the "fatal fluctuation" 
> but not indefinitely.
>> You mention that 4.6 might help with factors 1-3. Does that mean that
>> you are making further changes? Assuming that we decide to stick with
>> our current setup, would you then advise us to stay with 4.0 for now
>> and wait for 4.6 to be released?
> If Berk has implemented these things correctly (which he usually does) 
> 4.6 will make it more efficient to run with perfect energy 
> conservation, but the inherent problems in your setup remain.
I don't have a "magic" solution which will make perfectly accurate 
simulations as fast as the twin-range setup in 4.0.
However, new kernels in 4.6 will make buffered pair lists much faster 
than version 4.5.
Except for water, these new kernels will even be faster than unbuffered 4.5.
The force jump is solved by using epsilon_rf=0.
The twin-range efficiency issue can't be solved, as 10 fs is simply too 
large an integration step for water.
If you have a GPU, a full 1.4 nm cut-off will be about as fast as 4.0 
with twin-range.

And my take on the PME vs RF problem:
If you have interactions between periodic images with PME, this means 
you have relevant long-range interactions.
PME treats those incorrectly, RF simply ignores them completely. I think 
this means that the error with PME
is smaller, as treating them in some way is better than completely 
ignoring them. But anyhow, this means that
your unit cell is too small and you should enlarge it.
If there are no long-range interactions, PME and RF (possibly with a 
long cut-off) are equally valid.
If you there are medium or long-range interactions PME will always be 
more efficient than RF.


More information about the gromacs.org_gmx-developers mailing list