[gmx-users] Problem with umbrella sampling and g_wham

Eric LeGresley elegresl at sfu.ca
Tue Dec 27 20:23:16 CET 2011


Hello Justin,

You are a GOD!!!!!!  Thanks sooooo much!!  This is what the histogram looks like now,

http://dl.dropbox.com/u/32738905/Histo_good.png

I am under and over sampling in many places.  I will generate more sample points and increase the umbrella sampling simulation times to 10ns as recommended by you and your tutorial.

Thanks again,
Eric LeGresley

----- Original Message -----
From: "Justin A. Lemkul" <jalemkul at vt.edu>
To: "Discussion list for GROMACS users" <gmx-users at gromacs.org>
Sent: Monday, 26 December, 2011 14:48:31
Subject: Re: [gmx-users] Problem with umbrella sampling and g_wham



Eric LeGresley wrote:
> Hello all,
> For some time now I have been unable to find a solution to my problem.  I am attempting to do umbrella sampling on a protein ligand system.  To generate the starting configurations, I use the following mdp file:
> 
> 
> title       = Protein-ligand complex MD Simulation
> define      = -DPOSRES_SMD   ;keeps from moving
> ; Run parameters
> integrator  = md        ; leap-frog integrator
> nsteps      = 250000    ; 2 * 250000 fs = 500 ps (0.5 ns)
> dt          = 0.002     ; 2 fs
> ; Output control
> nstxout     = 0         ; suppress .trr output
> nstvout     = 0         ; suppress .trr output
> nstenergy   = 500      ; save energies every 1 ps
> nstlog      = 500      ; update log file every 1 ps
> nstxtcout   = 500      ; write .xtc trajectory every 1 ps
> energygrps  = Protein_CA MOL
> ; Bond parameters
> continuation    = yes           ; first dynamics run
> constraint_algorithm = lincs    ; holonomic constraints
> constraints     = all-bonds     ; all bonds (even heavy atom-H bonds) constrained
> lincs_iter      = 1             ; accuracy of LINCS
> lincs_order     = 4             ; also related to accuracy
> ; Neighborsearching
> ns_type     = grid      ; search neighboring grid cells
> nstlist     = 5         ; 10 fs
> rlist       = 0.9       ; short-range neighborlist cutoff (in nm)
> rcoulomb    = 0.9       ; short-range electrostatic cutoff (in nm)
> rvdw        = 1.4       ; short-range van der Waals cutoff (in nm)
> ; Electrostatics
> coulombtype     = PME       ; Particle Mesh Ewald for long-range electrostatics
> pme_order       = 4         ; cubic interpolation
> fourierspacing  = 0.16      ; grid spacing for FFT
> ;COM Pulling
> pull              = umbrella
> pull_geometry     = direction
> pull_start        = yes
> pull_dim          = Y Y Y
> pull_group0       = Protein_CA
> pull_group1       = MOL
> pull_vec1         = -2.4 -1.0 -0.4
> pull_rate1        = 0.008                   ; will pull 4nm during the 500ps run
> pull_k1           = 500
> ; Temperature coupling is on
> tcoupl      = nose-hoover
> tc-grps     = Protein_MOL Water_and_ions    ; two coupling groups - more accurate
> tau_t       = 0.2   0.2                     ; time constant, in ps
> ref_t       = 300   300                     ; reference temperature, one for each group, in K
> ; Pressure coupling is off
> pcoupl      = Parrinello-Rahman             ; pressure coupling is on for NPT
> pcoupltype  = isotropic                     ; uniform scaling of box vectors
> tau_p       = 2.0                           ; time constant, in ps
> ref_p       = 1.0                           ; reference pressure, in bar
> compressibility = 4.5e-5                    ; isothermal compressibility of water, bar^-1
> ; Periodic boundary conditions
> pbc         = xyz       ; 3-D PBC
> ; Dispersion correction
> DispCorr    = EnerPres  ; account for cut-off vdW scheme
> ; Velocity generation
> gen_vel     = yes       ; assign velocities from Maxwell distribution
> gen_temp    = 300       ; temperature for Maxwell distribution
> gen_seed    = -1        ; generate a random seed
> 
> A distance vs time graph of this 500ps simulation is here:
> 
> http://dl.dropbox.com/u/32738905/Dist_SMD500_upped50.png
> 
> 
> I then run umbrella sampling on 21 of the 500 generated starting configurations to use as input for g_wham.  The mdp file I use for the umbrella sampling is as follows:
> 
> title       = MD Run Umbrella
> define      = -DPOSRES_SMD  ; protein does not move
> ; Run parameters
> integrator  = md        ; leap-frog integrator
> nsteps      = 1000000     ; 2 * 1000000 = 2000 ps (2ns)
> dt          = 0.002     ; 2 fs
> tinit       = 0
> ; Output control
> nstxout     = 50000       ; save coordinates every 10 ps
> nstvout     = 50000       ; save velocities every 10 ps
> nstfout     = 5000       ; every 10 ps
> nstxtcout   = 5000       ; every 10 ps
> nstenergy   = 5000       ; save energies every 10 ps
> nstlog      = 5000       ; update log file every 10 ps
> energygrps  = Protein_CA MOL
> ; Bond parameters
> continuation    = yes           ; first dynamics run
> constraint_algorithm = lincs    ; holonomic constraints
> constraints     = all-bonds     ; all bonds (even heavy atom-H bonds) constrained
> lincs_iter      = 1             ; accuracy of LINCS
> lincs_order     = 4             ; also related to accuracy
> ; Neighborsearching
> ns_type     = grid      ; search neighboring grid cells
> nstlist     = 5         ; 10 fs
> rlist       = 0.9       ; short-range neighborlist cutoff (in nm)
> rcoulomb    = 0.9       ; short-range electrostatic cutoff (in nm)
> rvdw        = 1.4       ; short-range van der Waals cutoff (in nm)
> ; Electrostatics
> coulombtype     = PME       ; Particle Mesh Ewald for long-range electrostatics
> pme_order       = 4         ; cubic interpolation
> fourierspacing  = 0.16      ; grid spacing for FFT
> ; Temperature coupling is on
> tcoupl      = nose-hoover
> tc-grps     = Protein_MOL Water_and_ions    ; two coupling groups - more accurate
> tau_t       = 0.2   0.2                     ; time constant, in ps
> ref_t       = 300   300                     ; reference temperature, one for each group, in K
> ; Pressure coupling is off
> pcoupl      = Parrinello-Rahman             ; pressure coupling is on for NPT
> pcoupltype  = isotropic                     ; uniform scaling of box vectors
> tau_p       = 2.0                           ; time constant, in ps
> ref_p       = 1.0                           ; reference pressure, in bar 
> compressibility = 4.5e-5                    ; isothermal compressibility of water, bar^-1
> ; Periodic boundary conditions
> pbc         = xyz       ; 3-D PBC
> ; Dispersion correction
> DispCorr    = EnerPres  ; account for cut-off vdW scheme
> ; Velocity generation
> gen_vel     = no       ; do not assign velocities
> ; Pull Code
> pull              = umbrella
> pull_geometry     = distance
> pull_start        = yes
> pull_dim          = Y Y Y
> pull_group0       = Protein_CA
> pull_group1       = MOL
> pull_rate1        = 0.0
> pull_k1           = 25000
> pull_nstxout      = 1000     ; every 2 ps
> pull_nstfout      = 1000     ; every 2 ps
> 
> Once this is finished, I run g_wham and I get the following output files:
> 
> Histogram:  http://dl.dropbox.com/u/32738905/Histogram_deletable2.png

Here you have only one histogram, so likely you have not plotted the .xvg file 
properly.  Use:

xmgrace -nxy histo.xvt

> Profile:    http://dl.dropbox.com/u/32738905/Profile_deletable2.png
> 

With the above histogram plotted properly, you should be able to identify 
regions of poor sampling.  Quite like 2 ns is insufficient in each window, or 
you otherwise need to add more windows.

> Both graphs begin at 1.28nm which is the starting distance between the protein and the ligand but there is something very wrong with them.  I was expecting something similar the output of Justin's umbrella sampling tutorial, since I based what I did on his tutorial.  All that is different is I am using the AMBERsb99 forcefield and I am doing a protein-ligand system not a peptide-peptide system.  If somebody could recommend something for me to do about this problem it would be greatly appreciated.  Also, if you would like additional information about the problem feel free to email me.
> 
> Thanks for your time,
> Eric LeGresley
> 
> P.S. I'm sorry that what I have written may seem a little simple to some people, but I am 15 years old and I only started using GROMACS a few months ago.

It is a long hard road before one can do quality simulations, so keep at it. 
Read all you can, ask questions about methodology, and don't be afraid to rely 
on "empirical" discoveries from time to time ;)

-Justin

-- 
========================================

Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================
-- 
gmx-users mailing list    gmx-users at gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the 
www interface or send it to gmx-users-request at gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists



More information about the gromacs.org_gmx-users mailing list