[gmx-developers] segfault goes away when restarting from cpt
Aleksei Iupinov
aleksei.iupinov at scilifelab.se
Fri May 5 13:13:17 CEST 2017
Hello Michael,
You are welcome to register and file a bug report at the
https://redmine.gromacs.org/ issue tracker.
There you can also attach the input file and the logs (so that we know
the exact GROMACS version, etc.).
Best regards,
Aleksei
On Fri, May 5, 2017 at 10:23 AM, Michael Brunsteiner <mbx0009 at yahoo.com>
wrote:
>
> hi,
>
> I post this here as it might be a developer rather than a user issue ...
> I ran an NPT sim with simulated annealing of an amorphous solid sample with
> some organic molecules, as in:
>
> gmx grompp -f md-1bar-353-253.mdp -p sis3-7-simp.top -c up-nr2-3.gro -o do-nr2-3.tpr
> nohup gmx mdrun -v -deffnm do-nr2-3 > er 2>&1 &
>
> after around 30 nanoseconds the simulation stops without further notice.
> neither the log file nor stdout or stderr contains any indication
> of what happened,
> but when i look into the relevant syslog file i find:
>
> May 5 03:24:38 rcpe-sbd-node03 kernel: [82541302.295784] gmx[2218]: segfault at ffffffff9d3ebea0 ip 00007f5b708be3a1 sp 00007f5b657f9dc0 error 7 in libgromacs.so.2.3.0[7f5b706d9000+1d1e000]
>
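The kernel line above already carries enough to narrow down the faulting instruction. A minimal Python sketch, assuming the usual Linux x86 segfault message layout (`ip` is the instruction pointer, the bracketed pair is the start address and size of the mapping that contains it, and the error value is the page-fault error code). The function name `decode_segfault` is made up for illustration:

```python
import re

# Syslog message from above, joined onto one line.
LINE = ("gmx[2218]: segfault at ffffffff9d3ebea0 ip 00007f5b708be3a1 "
        "sp 00007f5b657f9dc0 error 7 in libgromacs.so.2.3.0[7f5b706d9000+1d1e000]")

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault message into offset and access type."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    err = int(m.group("err"), 16)  # assumed to be printed in hex
    return {
        "object": m.group("obj"),
        # Offset of the faulting instruction relative to the mapping start.
        "offset": ip - base,
        "ip_inside_mapping": base <= ip < base + size,
        # x86 page-fault error code bits.
        "protection_violation": bool(err & 1),  # 0 would mean page not present
        "write_access": bool(err & 2),          # 0 would mean read
        "user_mode": bool(err & 4),             # 0 would mean kernel mode
    }


if __name__ == "__main__":
    info = decode_segfault(LINE)
    print(info["object"], hex(info["offset"]))
    print({k: v for k, v in info.items() if k not in ("object", "offset")})
```

For this message the offset comes out as 0x1e53a1 into libgromacs.so.2.3.0, and error 7 reads as a user-mode write that hit a protection violation. If the mapping starts at file offset 0 and the library was built with debug symbols (both assumptions), that offset could be passed to something like `addr2line -f -C -e <path to libgromacs.so.2.3.0> 0x1e53a1` to get a function name for the bug report.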
> when i restart the sim on the same computer and from the last cpt file, as
> in:
>
> nohup gmx mdrun -v -deffnm do-nr2-3 -cpi do-nr2-3.cpt -noappend > er 2>&1 &
>
> the sim happily continues beyond the point where it previously seg-faulted
> without any further issues ...
>
> the tpr file is too large to attach (if anybody's interested i can upload it
> somewhere).
> below i put the last 30 or so lines of both stderr+stdout and the log file.
> I believe the warning at the end of stderr is harmless, but even if it
> actually is the reason for the segfault, this still does not explain why
> nothing is written to stderr when it happens and why the sim works when
> restarted from the cpt file ... can it be that this is a hardware issue?
>
> regards,
> Michael
>
>
> stderr+stdout:
> [..]
> Brand: Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
> SIMD instructions most likely to fit this hardware: AVX_256
> SIMD instructions selected at GROMACS compile time: AVX_256
>
> Hardware topology: Full, with devices
> GPU info:
> Number of GPUs detected: 1
> #0: NVIDIA GeForce GTX 780, compute cap.: 3.5, ECC: no, stat: compatible
>
> Reading file do-nr2-3.tpr, VERSION 2016.3 (single precision)
> Changing nstlist from 20 to 40, rlist from 1.2 to 1.2
>
> Using 1 MPI thread
> Using 12 OpenMP threads
>
> 1 compatible GPU is present, with ID 0
> 1 GPU auto-selected for this run.
> Mapping of GPU ID to the 1 PP rank in this node: 0
>
> starting mdrun 'system'
> 110000000 steps, 110000.0 ps.
> step 80: timed with pme grid 40 40 24, coulomb cutoff 1.200: 81.2 M-cycles
> step 80: the box size limits the PME load balancing to a coulomb cut-off of 1.368
> step 160: timed with pme grid 32 36 24, coulomb cutoff 1.368: 72.9 M-cycles
> step 240: timed with pme grid 36 36 24, coulomb cutoff 1.264: 75.7 M-cycles
> step 320: timed with pme grid 36 40 24, coulomb cutoff 1.216: 78.6 M-cycles
> step 400: timed with pme grid 40 40 24, coulomb cutoff 1.200: 81.2 M-cycles
> optimal pme grid 32 36 24, coulomb cutoff 1.368
> step 31031000, will finish Fri May 5 14:27:08 2017
> Step 31031061 Warning: pressure scaling more than 1%, mu: 0.999153 0.982333 0.997814
>
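To see which box dimension triggered the pressure-scaling warning, the three mu values can be compared against the 1% limit named in the warning text. A small Python sketch, assuming the three numbers are the diagonal scaling factors for x, y and z in that order (the helper name `axes_over_limit` is hypothetical):

```python
def axes_over_limit(mu, limit=0.01):
    """Return {axis: |mu - 1|} for every axis scaled by more than `limit`."""
    return {
        axis: abs(factor - 1.0)
        for axis, factor in zip("xyz", mu)
        if abs(factor - 1.0) > limit
    }


if __name__ == "__main__":
    # mu values copied from the warning at step 31031061.
    mu = (0.999153, 0.982333, 0.997814)
    for axis, deviation in axes_over_limit(mu).items():
        print(f"{axis}: scaled by {deviation:.2%} in one step")
```

Only the second component exceeds the limit here, with a shrink of roughly 1.8% in a single scaling step.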
>
>
> log-file:
> [..]
>            Step           Time
>        31030000    31030.00000
>
> Current ref_t for group System: 327.9
>    Energies (kJ/mol)
>            Bond          Angle    Proper Dih.  Improper Dih.          LJ-14
>     8.65826e+03    1.38650e+04    1.08781e+04    4.26101e+02    6.01339e+03
>      Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential
>    -2.98635e+04   -1.12638e+04    1.63267e+04    2.11350e+02    1.52514e+04
>     Kinetic En.   Total Energy    Temperature Pressure (bar)
>     2.22541e+04    3.75055e+04    3.28069e+02    6.14802e+02
>
>            Step           Time
>        31031000    31031.00000
>
> Current ref_t for group System: 327.8
>    Energies (kJ/mol)
>            Bond          Angle    Proper Dih.  Improper Dih.          LJ-14
>     8.41290e+03    1.38660e+04    1.08950e+04    3.51583e+02    5.79937e+03
>      Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.      Potential
>    -2.99386e+04   -1.15255e+04    1.64549e+04    2.30994e+02    1.45468e+04
>     Kinetic En.   Total Energy    Temperature Pressure (bar)
>     2.27307e+04    3.72775e+04    3.35095e+02   -1.54620e+00
>
> ------------------------------
> --
> Gromacs Developers mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-developers or
> send a mail to gmx-developers-request at gromacs.org.