[gmx-users] Re: gromacs 4.6 segfault

Justin Lemkul jalemkul at vt.edu
Tue Jan 15 13:19:49 CET 2013



On 1/15/13 7:16 AM, Dr. Vitaly Chaban wrote:
> On Tue, Jan 15, 2013 at 1:09 PM, Justin Lemkul <jalemkul at vt.edu> wrote:
>>
>>
>> On 1/15/13 7:06 AM, Dr. Vitaly Chaban wrote:
>>>>
>>>> Using mdrun (version 4.6-beta3) on a GPU node (1 NVIDIA K10 with CUDA
>>>> drivers and runtime 4.2 + 2 Intel 6-core E5 CPUs with hyperthreading
>>>> and SSE4.1), I always get the following segfault after a few or a few
>>>> hundred ns:
>>>>
>>>> line 15: 28957 Segmentation fault mdrun -deffnm pdz_trans_NVT_equi_4
>>>> -maxh 95
>>>>
>>>> I can restart the system using the cpt file and run it for the next
>>>> few or few hundred ns, and then I get the same segfault again.
>>>> The same system runs fine for 1 μs without any complaints on a different
>>>> cluster (mdrun version 4.6-beta3 on a GPU node with 1 NVIDIA M2090,
>>>> CUDA drivers and runtime 4.2 + 2 Intel 6-core X5 CPUs and SSE4.1).
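>>>>
>>>> The restart from the cpt file is done in the standard way, roughly along
>>>> these lines (the explicit -cpi argument is an assumption about the exact
>>>> invocation; -deffnm already points mdrun at the matching checkpoint):
>>>>
>>>> mdrun -deffnm pdz_trans_NVT_equi_4 -cpi pdz_trans_NVT_equi_4.cpt -maxh 95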
>>>>
>>>> My system consists of a 95-residue protein solvated in approximately
>>>> 6000 SPC water molecules.
>>>> .mdp parameters:
>>>>
>>>> ;
>>>> title = ttt
>>>> cpp = /lib/cpp
>>>> include = -I../top
>>>> constraints = hbonds
>>>> integrator = md
>>>> cutoff-scheme = verlet
>>>>
>>>> dt = 0.002 ; ps !
>>>> nsteps = 500000000 ; total 1 µs with dt = 0.002 ps
>>>> nstcomm = 25 ; frequency for center of mass motion removal
>>>> nstcalcenergy = 25
>>>> nstxout = 100000 ; frequency for writing the trajectory
>>>> nstvout = 100000 ; frequency for writing the velocities
>>>> nstfout = 100000 ; frequency to write forces to output trajectory
>>>> nstlog = 1000000 ; frequency to write the log file
>>>> nstenergy = 10000 ; frequency to write energies to energy file
>>>> nstxtcout = 10000
>>>>
>>>> xtc_grps = System
>>>>
>>>> nstlist = 25 ; Frequency to update the neighbor list
>>>> ns_type = grid ; Make a grid in the box and only check atoms in
>>>> neighboring grid cells when constructing a new neighbor list
>>>> rlist = 1.4 ; cut-off distance for the short-range neighbor list
>>>>
>>>> coulombtype = PME ; Fast Particle-Mesh Ewald electrostatics
>>>> rcoulomb = 1.4 ; cut-off distance for the coulomb field
>>>> vdwtype = cut-off
>>>> rvdw = 1.4 ; cut-off distance for the vdw field
>>>> fourierspacing = 0.12 ; The maximum grid spacing for the FFT grid
>>>> pme_order = 6 ; Interpolation order for PME
>>>> optimize_fft = yes
>>>> pbc = xyz
>>>> Tcoupl = v-rescale
>>>> tc-grps = System
>>>> tau_t = 0.1
>>>> ref_t = 300
>>>>
>>>> energygrps = Protein Non-Protein
>>>>
>>>> Pcoupl = no;berendsen
>>>> tau_p = 0.1
>>>> compressibility = 4.5e-5
>>>> ref_p = 1.0
>>>> nstpcouple = 5
>>>> refcoord_scaling = all
>>>> Pcoupltype = isotropic
>>>> gen_vel = no
>>>> gen_temp = 300
>>>> gen_seed = -1
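>>>>
>>>> For completeness, the run input is generated from this mdp in the usual
>>>> way, roughly as follows (the structure and topology file names here are
>>>> placeholders, not the actual files):
>>>>
>>>> grompp -f pdz_trans_NVT_equi_4.mdp -c conf.gro -p topol.top -o pdz_trans_NVT_equi_4.tpr
>>>> mdrun -deffnm pdz_trans_NVT_equi_4 -maxh 95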
>>>>
>>>> Since I have no clue which parameter should be tuned, any guess would
>>>> be very welcome.
>>>>
>>>
>>> I think the cause of the issue lies outside your MDP file and rather in
>>> the GPU installation. A primitive piece of advice would be to decrease
>>> the time step, say by half, and see what happens. Even very
>>> well-equilibrated systems, even without GPU support, sometimes crash
>>> after a few million steps...
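>>>
>>> For example, halving the time step while keeping the same total simulated
>>> time would look like this in the mdp (illustrative numbers based on the
>>> settings above):
>>>
>>> dt     = 0.001       ; ps, half of the original 0.002
>>> nsteps = 1000000000  ; doubled so the total simulated time stays 1 µs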
>>>
>>
>> The fact that the run restarts from a checkpoint, runs for a long period
>> of time, and also runs on different hardware argues against that statement.
>
>
> Against the statement that the problem is outside the MDP file? Hmmm...
>

Sorry, I read that wrong.  I got confused when you started suggesting "primitive 
advice" and that the OP may be suffering from a sporadic crash that could happen 
to anyone at any time.

-Justin

-- 
========================================

Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================


