[gmx-users] Umbrella Sampling and flat-bottom position restraints

Justin Lemkul jalemkul at vt.edu
Mon Nov 17 18:28:04 CET 2014

On 11/16/14 5:44 PM, Oliver Stueker wrote:
> Hi Justin,
> Thanks for the feedback.
> On Sun, Nov 16, 2014 at 2:51 PM, Justin Lemkul <jalemkul at vt.edu> wrote:
>> On 11/16/14 1:04 PM, Oliver Stueker wrote:
>>> Dear fellow Gromacs Users,
>>> I'm trying to calculate the binding free energy between a protein and a
>>> small molecule using Umbrella Sampling.
>>> I made my first attempt last year using Gromacs 4.5 and ran into the same
>>> convergence problems that have been pointed out on the mailing list
>>> several
>>> times.
>>> To overcome these problems I now want to give it another try utilizing the
>>> flat-bottomed position restraints that have been introduced in Gromacs
>>> 5.0.
>>> My plan is to perform the Umbrella sampling along the Z-axis and to
>>> restrict the protein as well as the ligand with flat-bottomed position
>>> restraints along X and Y into a cylindrical space (as suggested e.g. by
>>> Chris Neale in [1]).
>>> But I'm not quite sure what radius the cylinder should have ( or how wide
>>> the flat bottom should be).
>>> The protein is roughly spherical with a diameter of ~4nm. I was initially
>>> thinking to use an r_{fb} of 2nm for the protein and  for the ligand 1/2
>>> of
>>> it's longest dimension to allow both of them to re-orient freely. But the
>>> more I think about, I'm afraid that it's too large.
>> Why is it necessary to pull only along the z-axis?  For simple ligand
>> binding to a globular protein, you can define the pull vector in any or all
>> of the three spatial dimensions.  Is there some reason you presuppose
>> binding only along z?
> ​Hmm, indeed it's probably not necessary to ​pull only along Z. However the
> box would need to be much larger if I can't control the direction.
> In my first attempt I had already aligned the vector between the COMs of
> Protein and ligand with the Z-axis so that I can pull along that direction.
> That time I've used a box of 10 x 10 x 15 nm.
> Last time I've used the following pull settings:
> ; Pull code
> pull            = umbrella
> pull_geometry   = distance
> pull_dim        = N N Y
> pull_start      = yes
> pull_ngroups    = 1
> pull_group0     = Protein
> pull_group1     = LIG
> pull_init1      = 0
> pull_rate1      = 0.0
> pull_k1         = 1000      ; kJ mol^-1 nm^-2
> pull_nstxout    = 1000      ; every 2 ps
> pull_nstfout    = 1000      ; every 2 ps
> ​I also had the problem that during the equillibration of the Umbrella
> windows the ligand ​moved a bit away from it's initial position so that the
> initial pull distance was a bit off. This time I want to prevent that by
> setting pull_init for each window directly and leave pull_start = off.
> ​Say if I ​use pull_dim = Y Y Y and a cubic box of 15 x 15 x 15 nm. That
> would restrict the ligand to the surface of a sphere of radius d (COM
> reference distance of the umbrella window) around the protein. Wouldn't
> that mean that I would need to sample the ligand conformations on the
> complete surface of each sphere to achieve sufficient conformational
> overlap between neighbouring windows? Those surfaces would become quite big
> once I reach to high COM distances (6 nm).

More judicious choice of reaction coordinate eliminates this problem.  Unless 
the binding site is coincident with the COM of the protein, it is better to 
choice a few binding site residues as the reference, thus more clearly defining 
the reaction coordinate.  At longer range, anyway, the interactions between the 
ligand and any protein residue are negligible, so I would not think you would 
need to sample the entire sphere around the reference group.  Indeed, this would 
likely require microseconds of sampling, which no one does.

> Say if in two windows n-1 and n+1 the ligand moves towards +X / +Y and in
> window n (in between) the ligand moves to -X / -Y. Then while the
> histograms might overlap nicely (according the sampled COM distances) the
> sampled conformations might be very different.

True, but the spacing of the windows in this case also depends on the 
flexibility of the ligand.

> ​It seems that simply using pull_dim = Y Y Y would also not really solve
> the problem.​

I disagree.  I think the choice of z-only pull in the case of a free-floating 
ligand is making a large assumption, and one based solely on convenience.  Note 
that this is a very different case from my tutorial, in which uni-directional, 
linear growth of an aggregate is experimentally proven; so don't lock yourself 
into that thinking too much.

>>   Has anyone have experience with that? Maybe even a published paper?
>>> So far I've unsuccessfully tried several times to find a published method.
>>> Also to I need to consider the flat-bottomed potential when doing the
>>> g_wham analysis?
>> It's an additional bias to the Hamiltonian, so it should probably be
>> accounted for in some way.  As to how, I'm not sure.  But it's an
>> additional degree of freedom that is being restrained.
> ​Yes, you are probably right.
> Would this work:
>     - generate conformations by pulling along Z and select conformations for
>     windows
>     - equillibrate each window
>     - run umbrella-sampling (to sample conformations/coordinates) along Z
>     with pull_init = d(window) and use flat-bottomed PosRes to restrict Protein
>     and Ligand in X/Y movement.
>        - as reference coordinates, I use the COM of Protein and ligand
>        respectively
>        - as r_{fb} I can use the radius of the sphere that encloses the
>        protein (or ligand) plus a small buffer.
>        - store coordinates in XTC and discard the pullx- / pullf- files.
>     - Do a re-run umbrella-sampling (to generate distances/forces) with pull_dim
>     = Y Y Y and without the PosRes to generate the pullx / pullf files that
>     will be used for g_wham.
> ​Instead of the last re-run I could maybe also run g_dist to generate my
> pullx.xvg files and then pass g_wham tpr files with pull_dim = Y Y Y and
> without the PosRes.

The conformations you sample are still biased by the flat-bottom restraint, even 
if it is infrequent.  Simply re-processing does not eliminate this effect.



Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow

Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 629
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201

jalemkul at outerbanks.umaryland.edu | (410) 706-7441


More information about the gromacs.org_gmx-users mailing list