[gmx-users] Long trajectory split

Mark Abraham mark.j.abraham at gmail.com
Sun Feb 23 20:21:27 CET 2014


On Sun, Feb 23, 2014 at 6:48 PM, Marcelo Depólo <marcelodepolo at gmail.com> wrote:

> Justin, the other runs with the very same binary do not produce the same
> problem.
>
> Mark, I just omitted the _mpi suffix from the command line here, but it was
> compiled as _mpi.
>

OK, that rules that problem out, but please don't simplify or approximate.
Computers are exact, and troubleshooting problems with them requires all of
the information. If we all understood everything perfectly, we wouldn't be
having problems ;-)

Those files do get closed at checkpoint intervals so that they can be hashed
and the hash value saved in the checkpoint. It is conceivable that some file
system would not close and re-open them properly. The .log file would comment
on at least some such conditions.
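
If you want to see what the checkpoint thinks it has written, it can be
dumped, and a continuation can be told to append to the existing output
files instead of opening fresh (and hence backed-up) ones. This is only a
rough sketch: it assumes the 4.6 tool names, your _mpi suffix, and a
checkpoint called state.cpt, so substitute whatever your run actually
produced:

  # Dump the checkpoint, including the list of output files and the
  # checksums stored for them (state.cpt is an assumed name).
  gmxdump -cp state.cpt | less

  # Continue the run, appending to the existing .trr/.edr/.log files
  # rather than writing new, backed-up copies.
  mpirun -np 24 mdrun_mpi -s prt.tpr -cpi state.cpt -append \
         -e prt.edr -o prt.trr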

But the real question is what you are doing differently from the times when
you have observed normal behaviour!
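
Also, on the mpirun line you quote below: once several backed-up parts do
exist, they can be stitched back together afterwards. Again just a sketch
with assumed names; trjcat and eneconv are the 4.6 concatenation tools, and
#prt.trr.1# etc. are the usual backup names, so use whatever files you
actually have:

  # Concatenate the backed-up trajectory parts with the current one.
  # Frame times from a checkpointed continuation should already be
  # consistent, so plain concatenation is usually enough here.
  trjcat -f '#prt.trr.1#' '#prt.trr.2#' prt.trr -o prt_full.trr

  # The energy files can be joined in the same way with eneconv.
  eneconv -f '#prt.edr.1#' '#prt.edr.2#' prt.edr -o prt_full.edr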

Mark


> The top of my log file:
>
> Gromacs version:    VERSION 4.6.1
> Precision:          single
> Memory model:       64 bit
> MPI library:        MPI
> OpenMP support:     disabled
> GPU support:        disabled
> invsqrt routine:    gmx_software_invsqrt(x)
> CPU acceleration:   SSE4.1
> FFT library:        fftw-3.3.2-sse2
> Large file support: enabled
> RDTSCP usage:       enabled
> Built on:           Sex Nov 29 16:08:45 BRST 2013
> Built by:           root at jupiter [CMAKE]
> Build OS/arch:      Linux 2.6.32.13-0.4-default x86_64
> Build CPU vendor:   GenuineIntel
> Build CPU brand:    Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
> Build CPU family:   6   Model: 44   Stepping: 2
> Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
> (...)
>
> Initializing Domain Decomposition on 24 nodes
> Dynamic load balancing: auto
> Will sort the charge groups at every domain (re)decomposition
> Initial maximum inter charge-group distances:
>     two-body bonded interactions: 0.621 nm, LJ-14, atoms 3801 3812
>   multi-body bonded interactions: 0.621 nm, G96Angle, atoms 3802 3812
> Minimum cell size due to bonded interactions: 0.683 nm
> Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.820 nm
> Estimated maximum distance required for P-LINCS: 0.820 nm
> This distance will limit the DD cell size, you can override this with -rcon
> Guess for relative PME load: 0.26
> Will use 18 particle-particle and 6 PME only nodes
> This is a guess, check the performance at the end of the log file
> Using 6 separate PME nodes
> Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
> Optimizing the DD grid for 18 cells with a minimum initial size of 1.025 nm
> The maximum allowed number of cells is: X 8 Y 8 Z 8
> Domain decomposition grid 3 x 2 x 3, separate PME nodes 6
> PME domain decomposition: 3 x 2 x 1
> Interleaving PP and PME nodes
> This is a particle-particle only node
> Domain decomposition nodeid 0, coordinates 0 0 0
>
>
>
> 2014-02-23 18:08 GMT+01:00 Justin Lemkul <jalemkul at vt.edu>:
>
> >
> >
> > On 2/23/14, 11:32 AM, Marcelo Depólo wrote:
> >
> >> Maybe I should explain it better.
> >>
> >> I am using "*mpirun -np 24 mdrun -s prt.tpr -e prt.edr -o prt.trr*",
> >> pretty much a standard command line. This batch job creates the outputs
> >> and, after some (random) time, a backup is made and new files are
> >> written, but the job itself does not finish.
> >>
> >>
> > It would help if you can post the .log file from one of the runs to see
> > the information regarding mdrun's parallel capabilities.  This still
> > sounds like a case of an incorrectly compiled binary.  Do other runs with
> > the same binary produce the same problem?
> >
> > -Justin
> >
> >
> >
> >> 2014-02-23 17:12 GMT+01:00 Justin Lemkul <jalemkul at vt.edu>:
> >>
> >>
> >>>
> >>> On 2/23/14, 11:00 AM, Marcelo Depólo wrote:
> >>>
> >>>> But it is not quite happening simultaneously, Justin.
> >>>>
> >>>> It is producing them one after another and, consequently, backing up
> >>>> the files.
> >>>>
> >>>>
> >>> You'll have to provide the exact commands you're issuing.  Likely you're
> >>> leaving the output names at the default, which causes them to be backed
> >>> up rather than overwritten.
> >>>
> >>>
> >>> -Justin
> >>>
> >>
> >>
> >>
> > --
> > ==================================================
> >
> > Justin A. Lemkul, Ph.D.
> > Ruth L. Kirschstein NRSA Postdoctoral Fellow
> >
> > Department of Pharmaceutical Sciences
> > School of Pharmacy
> > Health Sciences Facility II, Room 601
> > University of Maryland, Baltimore
> > 20 Penn St.
> > Baltimore, MD 21201
> >
> > jalemkul at outerbanks.umaryland.edu | (410) 706-7441
> > http://mackerell.umaryland.edu/~jalemkul
> >
> > ==================================================
>
>
>
> --
> Marcelo Depólo Polêto
> Uppsala Universitet - Sweden
> Science without Borders - CAPES
> Phone: +46 76 581 67 49
>


More information about the gromacs.org_gmx-users mailing list