[gmx-users] GPU crashes

Thu Jun 7 13:25:31 CEST 2012

On 6/7/12 3:57 AM, lloyd riggs wrote:
> Did you play with the time step?  Just curious, but I wondered what
> happened with 0.0008, 0.0005, 0.0002.  I found that with a well-behaved
> protein, as soon as I added a small (non-protein) molecule which rotated
> wildly while attached to the protein, it would crash unless I reduced the
> time step to the above when constraints were removed after EQ ... it always
> seemed to me it didn't like the rotation or bond angles, seeing them as a
> violation, but acted as if it were an amino acid? (the same bond type but with
> wider rotation, as one end wasn't fixed to a chain)  If your loop moves via
> the backbone, the calculated angles, bonds or whatever might appear to the
> computer to be violating the parameter settings for problems, errors, etc., as
> it can't track them fast enough over the time step, i.e. atom 1-2-3 and then
> delta 1-2-3 with xyz parameters, but then the particular set has additional
> rotation, etc. and may include the chain atoms which bend wildly (N-Ca-Cb-Cg,
> maybe a dihedral), but probably not this.
>
> Just a thought, but probably not the right answer either: it might be the
> way it is broken down (above) over GPUs, which convert everything to
> matrices (non-standard, just for basic math operations, not real matrices per
> se) for execution, and then some library problem which would not account for
> long-range rapid (0.0005) movements at the chain (Ca,N,O to something else)
> and then tries to apply these to Cb-Cg-O-H, etc. using the initial points
> while looking at the parameters for, say, a single amino acid...Maybe the
> constraints would cause this, which would make it a pain to EQ, but this
> allowed me to increase the time step, but would ruin the experiment I had
> worked on as I needed it unconstrained to show it didn't float away when
> proteins were pulled, etc...I was using a different integrator though...just
> normal MD.
>

I have long wondered if constraints were properly handled by the OpenMM library.
I am constraining all bonds, so in principle, a dt of 0.002 ps should not be a
problem.  The note printed indicates that the constraint algorithm is changed
from the one selected (LINCS) to whatever OpenMM uses (SHAKE and a few others in
combination).  Perhaps I can try running without constraints and a reduced dt,
but I'd like to avoid it.

I wish I could efficiently test to see if this behavior was GPU-specific, but
unfortunately the non-GPU implementation of the implicit code can currently only
be run in serial or on 2 CPUs due to an existing bug.  I can certainly test it,
but due to the large number of atoms, it will take several days to even approach
1 ns.

> And your cutoffs for vdw, etc...Why are they 0?  I don't know if this means a
> default set is then used...but if not?  Wouldn't they try integrating using
> both types of formula, or would it be just using Coulomb or vice versa? (don't
> know what that would do to the code, but I assume it means no vdw and all
> Coulomb, but then zeros are always a problem for computers).
>

The setup is for the all-vs-all kernels.  Setting cutoffs equal to zero and 
using a fixed neighbor list triggers these special optimized kernels.  I have 
also noticed that long, finite cutoffs (on the order of 4.0 nm) lead to 
unacceptable energy drift and structural instability in well-behaved systems 
(even the benchmarks).  For instance, the backbone RMSD of lysozyme is twice as 
large in the case of a 4.0-nm cutoff relative to the all-vs-all setup, and the 
energy drift is quite substantial.
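
For anyone wanting to put a number on the drift when comparing the two setups,
here is a minimal sketch that fits a straight line to an energy time series and
reports the slope.  It assumes the times (ps) and energies (kJ/mol) have already
been extracted into plain Python lists; the function name is hypothetical and
this is not how any GROMACS tool necessarily computes drift.

```python
# Minimal sketch: energy drift as the least-squares slope of energy vs. time.
# Inputs are assumed to be plain lists already read from an energy file.


def energy_drift(times_ps, energies_kj_mol):
    """Return the fitted slope in kJ/mol per ps."""
    n = len(times_ps)
    if n < 2 or n != len(energies_kj_mol):
        raise ValueError("need at least two samples of equal length")

    t_mean = sum(times_ps) / n
    e_mean = sum(energies_kj_mol) / n

    # Covariance of (t, E) and variance of t, both unnormalized.
    cov = sum((t - t_mean) * (e - e_mean)
              for t, e in zip(times_ps, energies_kj_mol))
    var = sum((t - t_mean) ** 2 for t in times_ps)
    if var == 0.0:
        raise ValueError("all time values are identical")

    return cov / var


if __name__ == "__main__":
    # Made-up numbers purely to show the call; not data from any simulation.
    times = [0.0, 1.0, 2.0, 3.0]
    energies = [-100.0, -99.5, -99.0, -98.5]
    print(energy_drift(times, energies), "kJ/mol per ps")
```

Dividing the slope by the number of atoms gives a per-atom figure, which makes
systems of different size easier to compare.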

-Justin

-- 
========================================

Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================