[gmx-users] combining differently-generated force-fields

Fri May 2 07:19:23 CEST 2008

I don't have a problem, per se, but would like to discuss the problems  
that may, or may not, arise when mixing force fields.

It is clear to me why one would not want to calculate the free energy  
of binding for two proteins, one using the amber ff and the other  
using the opls ff; also it is clear that there would be problems  
simulating a box of water half of which is tip3p and half of which is  
spc. What these examples have in common is that such simulations would
apply dissimilar parameter sets to similar functional groups, so any
results could carry significant biases whose source would not be
obvious to the user.

However, if one were simulating the binding of a protein to DNA, or a
protein embedded in a lipid bilayer, the functional groups are no  
longer shared by different types of macromolecules. Since I work on  
membrane proteins, let me take the case of an oplsaa protein in a  
Berger lipid bilayer. Not only are these ff's differently generated,  
but one is all-atom and one is united-atom. The important difference  
in this case is that there are few functional groups of the lipids  
that resemble those of the protein, e.g. the charged amine of a lipid
head group and that of a lysine of the protein. Generally though, the
functional
groups are entirely different between these macromolecules. I believe  
that this is also the case for protein-DNA simulations. Therefore,  
what biases can possibly occur by the combination of different ff's in  
this case that could not also occur by combinations that exclusively  
use a single ff?

I take the extreme example and ask: what special relevance do the opls  
ion parameters have to the opls protein parameters? Although the ion
parameters may have been derived "in a manner consistent with how the
rest of the force field was originally derived"
(http://wiki.gromacs.org/index.php/Parameterization), in this extreme
case that consistency seems to me an entirely abstract concept of no
particular value. In other words, how can Na+ possibly be parameterized
consistently or inconsistently with an amino acid that contains no Na?

To clearly state my current point of view in the absence of a shred of  
data, I suggest the following: "One should not combine parameters that  
are derived inconsistently with one another except in cases where such
combination can be made without introducing multiple parametric  
definitions of a given functional group." If you believe that, it  
would therefore be acceptable to combine the following in any way: i)  
protein, ii) water, iii) ion, iv) DNA, v) lipid, vi) carbohydrate. A
seventh group, small molecules, is difficult to classify since one
must take into consideration the specific functional groups. For
example, I would suggest that ATP and a protein should be fine if  
different ff's are used, but that ATP and DNA should use a consistent  
ff when simulated in conjunction.
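
The rule proposed above can be sketched as a simple check: flag any
functional group that would end up with parameters from more than one
ff. This is only an illustration under loud assumptions: the function
name conflicting_groups, the placeholder ff names "ff_A" and "ff_B", and
the hand-picked functional-group labels are all hypothetical and are not
read from any real force-field file.

```python
from collections import defaultdict


def conflicting_groups(components):
    """Return functional groups that are parameterized by more than one ff.

    components maps a component name to a (force_field, functional_groups)
    pair, where functional_groups is a set of labels.
    """
    seen = defaultdict(set)  # functional group -> force fields defining it
    for force_field, groups in components.values():
        for group in groups:
            seen[group].add(force_field)
    # Keep only the groups that received more than one parametric definition.
    return {g: sorted(ffs) for g, ffs in seen.items() if len(ffs) > 1}


# Hypothetical, hand-picked labels for illustration only.
protein = {"amide", "carboxylate", "ammonium", "hydroxyl"}
atp = {"phosphate", "ribose", "adenine"}
dna = {"phosphate", "deoxyribose", "adenine", "thymine"}

# ATP with a protein: no shared labels, so nothing is flagged.
print(conflicting_groups({"protein": ("ff_A", protein), "ATP": ("ff_B", atp)}))

# ATP with DNA: shared groups are flagged when the ff's differ.
print(conflicting_groups({"DNA": ("ff_A", dna), "ATP": ("ff_B", atp)}))
```

With these assumed labels, the protein/ATP pair is not flagged, while
the DNA/ATP pair is flagged for phosphate and adenine. How finely one
defines a "functional group" decides the outcome, which is exactly why
small molecules are hard to classify.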

As we ramp up our simulations for ever-increasing cpu power and for  
gromacs 4, these questions are well beyond pedantic. It is one thing  
to develop parameters for a small molecule consistently with the
methodology used for the protein/DNA ff. However, simulations of more  
than one different type of macromolecule (e.g. protein-DNA  
simulations) would greatly benefit, it seems, from the ability to use  
the DNA parameters that lead to the most accurate sampling of DNA  
phase space and the protein parameters that lead to the most accurate  
sampling of protein phase space. It is my conjecture that such  
combinations would not only be appropriate, but that they would be  
optimal.

Disclaimer: If you are considering combining differently-generated  
force-fields, please do not take this post as encouragement. The
standard advice never to combine force-fields still stands. I
only wanted to have some discussion on this topic.

Thanks for all comments, especially those that are in disagreement  
with my proposition.

Chris.