[gmx-users] Long trajectory split

Marcelo Depólo marcelodepolo at gmail.com
Thu Feb 27 13:34:01 CET 2014


Dear Dr. Chaban,

Which details or files do you need? I would be very happy to help resolve
this question by posting any files you request.



2014-02-23 22:21 GMT+01:00 Dr. Vitaly Chaban <vvchaban at gmail.com>:

> You have not provided all the details. As was pointed out at the very
> beginning, you most likely have incorrect parallelism in this case.
> Can you post all the files you obtain for people to inspect?
>
>
> Dr. Vitaly V. Chaban
>
>
> On Sun, Feb 23, 2014 at 9:04 PM, Marcelo Depólo <marcelodepolo at gmail.com> wrote:
> >  Justin, as far as I can tell, the next log file starts at 0 ps, which
> > would mean that it is restarting for some reason. At first, I imagined
> > that it was only splitting the data among files due to some kind of size
> > limit, as you said, but when I tried to concatenate the trajectories, it
> > gave me nonsensical output, with a lot of 'beginnings'.
> >
> > I will check with the cluster experts whether there is some kind of size
> > limit. It seems to me the most logical source of the problem.
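> >
> > (As a side note, if the run really were just being split into pieces, the
> > pieces could be stitched back together with trjcat. A minimal sketch,
> > assuming GROMACS 4.6 tool names and that the backed-up files were first
> > renamed to the hypothetical part1.trr, part2.trr, ...:
> >
> >     # concatenate the pieces; -settime lets you adjust the starting
> >     # time of each piece interactively when the time stamps overlap
> >     trjcat -f part1.trr part2.trr part3.trr -o full.trr -settime
> >
> > Since each piece here apparently restarts at 0 ps, the times would have
> > to be shifted by hand, which is presumably why naive concatenation gave
> > 'a lot of beginnings'.)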
> >
> > Mark, the only difference this time is the time scale, which was set from
> > the beginning. Apart from the protein itself, even the .mdp files were
> > copied from a successful folder.
> >
> > But thank you both for the support.
> >
> >
> > 2014-02-23 20:20 GMT+01:00 Mark Abraham <mark.j.abraham at gmail.com>:
> >
> >> On Sun, Feb 23, 2014 at 6:48 PM, Marcelo Depólo <marcelodepolo at gmail.com> wrote:
> >>
> >> > Justin, the other runs with the very same binary do not produce the
> >> > same problem.
> >> >
> >> > Mark, I just omitted the _mpi suffix from the command line here, but it
> >> > was compiled as _mpi.
> >> >
> >>
> >> OK, that rules that problem out, but please don't simplify and
> >> approximate. Computers are exact, and troubleshooting problems with them
> >> requires all the information. If we all understood perfectly, we wouldn't
> >> be having problems ;-)
> >>
> >> Those files do get closed at checkpoint intervals, so that they can be
> >> hashed and the hash value saved in the checkpoint. It is conceivable that
> >> some file system would not close and re-open them properly. The .log
> >> files would comment on at least some such conditions.
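> >>
> >> (A quick way to check, as a sketch with the 4.6-era tools: gmxcheck
> >> prints the time stamp of every frame in a trajectory, so a silent restart
> >> shows up as the time dropping back to 0 ps partway through:
> >>
> >>     gmxcheck -f prt.trr
> >>
> >> and gmxdump -cp state.cpt shows what the last checkpoint recorded.)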
> >>
> >> But the real question is what you are doing differently from the times
> >> when you have observed normal behaviour!
> >>
> >> Mark
> >>
> >>
> >> > My log file top:
> >> >
> >> > Gromacs version:    VERSION 4.6.1
> >> > Precision:          single
> >> > Memory model:       64 bit
> >> > MPI library:        MPI
> >> > OpenMP support:     disabled
> >> > GPU support:        disabled
> >> > invsqrt routine:    gmx_software_invsqrt(x)
> >> > CPU acceleration:   SSE4.1
> >> > FFT library:        fftw-3.3.2-sse2
> >> > Large file support: enabled
> >> > RDTSCP usage:       enabled
> >> > Built on:           Sex Nov 29 16:08:45 BRST 2013
> >> > Built by:           root at jupiter [CMAKE]
> >> > Build OS/arch:      Linux 2.6.32.13-0.4-default x86_64
> >> > Build CPU vendor:   GenuineIntel
> >> > Build CPU brand:    Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
> >> > Build CPU family:   6   Model: 44   Stepping: 2
> >> > Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm mmx msr
> >> > nonstop_tsc pcid pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1
> >> > sse4.2 ssse3
> >> > (...)
> >> >
> >> > Initializing Domain Decomposition on 24 nodes
> >> > Dynamic load balancing: auto
> >> > Will sort the charge groups at every domain (re)decomposition
> >> > Initial maximum inter charge-group distances:
> >> >     two-body bonded interactions: 0.621 nm, LJ-14, atoms 3801 3812
> >> >   multi-body bonded interactions: 0.621 nm, G96Angle, atoms 3802 3812
> >> > Minimum cell size due to bonded interactions: 0.683 nm
> >> > Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.820 nm
> >> > Estimated maximum distance required for P-LINCS: 0.820 nm
> >> > This distance will limit the DD cell size, you can override this with -rcon
> >> > Guess for relative PME load: 0.26
> >> > Will use 18 particle-particle and 6 PME only nodes
> >> > This is a guess, check the performance at the end of the log file
> >> > Using 6 separate PME nodes
> >> > Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
> >> > Optimizing the DD grid for 18 cells with a minimum initial size of 1.025 nm
> >> > The maximum allowed number of cells is: X 8 Y 8 Z 8
> >> > Domain decomposition grid 3 x 2 x 3, separate PME nodes 6
> >> > PME domain decomposition: 3 x 2 x 1
> >> > Interleaving PP and PME nodes
> >> > This is a particle-particle only node
> >> > Domain decomposition nodeid 0, coordinates 0 0 0
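> >> >
> >> > (Reading the numbers above: with a guessed relative PME load of 0.26,
> >> > mdrun reserves roughly 0.26 x 24 = 6.24, rounded down to 6, ranks for
> >> > PME, leaving 18 particle-particle ranks, hence the 3 x 2 x 3 = 18-cell
> >> > PP grid.)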
> >> >
> >> >
> >> >
> >> > 2014-02-23 18:08 GMT+01:00 Justin Lemkul <jalemkul at vt.edu>:
> >> >
> >> > >
> >> > >
> >> > > On 2/23/14, 11:32 AM, Marcelo Depólo wrote:
> >> > >
> >> > >> Maybe I should explain it better.
> >> > >>
> >> > >> I am using "mpirun -np 24 mdrun -s prt.tpr -e prt.edr -o prt.trr",
> >> > >> pretty much a standard command line. This job runs in a batch queue
> >> > >> and creates the outputs; after some (random) time, a backup is made
> >> > >> and new files are written, but the job itself does not finish.
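> >> > >>
> >> > >> (For comparison, a sketch of a restart-friendly variant of that
> >> > >> command, assuming the binary is really mdrun_mpi as discussed above:
> >> > >> with -cpi the run continues from the checkpoint instead of starting
> >> > >> over at 0 ps, and with matching file names it appends rather than
> >> > >> backing files up:
> >> > >>
> >> > >>     mpirun -np 24 mdrun_mpi -s prt.tpr -deffnm prt -cpi prt.cpt -append
> >> > >>
> >> > >> -deffnm prt makes all outputs share the prt prefix, which -append
> >> > >> needs in order to recognise and extend the existing files.)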
> >> > >>
> >> > >>
> >> > > It would help if you could post the .log file from one of the runs, to
> >> > > see the information regarding mdrun's parallel capabilities.  This
> >> > > still sounds like a case of an incorrectly compiled binary.  Do other
> >> > > runs with the same binary produce the same problem?
> >> > >
> >> > > -Justin
> >> > >
> >> > >
> >> > >
> >> > >> 2014-02-23 17:12 GMT+01:00 Justin Lemkul <jalemkul at vt.edu>:
> >> > >>
> >> > >>
> >> > >>>
> >> > >>> On 2/23/14, 11:00 AM, Marcelo Depólo wrote:
> >> > >>>
> >> > >>>> But it is not quite happening simultaneously, Justin.
> >> > >>>>
> >> > >>>> It is producing them one after another and, consequently, backing
> >> > >>>> up the files.
> >> > >>>>
> >> > >>>>
> >> > >>> You'll have to provide the exact commands you're issuing.  Likely
> >> > >>> you're leaving the output names at the default, which causes them to
> >> > >>> be backed up rather than overwritten.
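> >> > >>>
> >> > >>> (To illustrate the backing up: when an output file name already
> >> > >>> exists, GROMACS renames the old file instead of overwriting it,
> >> > >>> along the lines of
> >> > >>>
> >> > >>>     prt.trr -> #prt.trr.1# -> #prt.trr.2# -> ...
> >> > >>>
> >> > >>> so a directory filling up with #...# files is the signature of
> >> > >>> repeated fresh starts under the same names.)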
> >> > >>>
> >> > >>>
> >> > >>> -Justin
> >> > >>>
> >> > >>> --
> >> > >>> ==================================================
> >> > >>>
> >> > >>> Justin A. Lemkul, Ph.D.
> >> > >>> Ruth L. Kirschstein NRSA Postdoctoral Fellow
> >> > >>>
> >> > >>> Department of Pharmaceutical Sciences
> >> > >>> School of Pharmacy
> >> > >>> Health Sciences Facility II, Room 601
> >> > >>> University of Maryland, Baltimore
> >> > >>> 20 Penn St.
> >> > >>> Baltimore, MD 21201
> >> > >>>
> >> > >>> jalemkul at outerbanks.umaryland.edu | (410) 706-7441
> >> > >>> http://mackerell.umaryland.edu/~jalemkul
> >> > >>>
> >> > >>> ==================================================
> >> > >>>
> >> > >>
> >> > >>
> >> > >>
> >> >
> >> >
> >> >
> >> >
> >>
> >
> >
> >
>



-- 
Marcelo Depólo Polêto
Uppsala Universitet - Sweden
Science without Borders - CAPES
Phone: +46 76 581 67 49

