[gmx-users] Why does the -append option exist?

Roland Schulz roland at utk.edu
Mon Jun 6 00:20:23 CEST 2011


Two comments about the discussion:

1) I agree that buffered output (kernel buffers, not application buffers)
should not affect I/O. If it does, it should be filed as a bug against the OS.
Maybe someone can write a short test application that tries to reproduce the
problem: write to a file from one node and, immediately after that test
program is killed, write to the same file from another node.
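
A minimal sketch of such a test, reduced to a single host so it can be run anywhere. The temporary file and the variable names are assumptions made for illustration. On a cluster, the file would live on the shared filesystem and the final append and checks would run from a second node. It only checks that lines already handed to the kernel survive a hard kill of the writer and that a later append lands after them.

```shell
#!/bin/bash
# Sketch only: single-host stand-in for the two-node test described above.
testfile=$(mktemp)   # assumption: on a cluster, use a path on the shared filesystem

# Writer: append one line per iteration; each echo is its own write() call,
# so the data sits in kernel buffers, not in an application buffer.
( i=0; while :; do echo "step $i" >> "$testfile"; i=$((i+1)); sleep 0.05; done ) &
writer=$!

sleep 0.5
kill -9 "$writer"                      # simulate the queue system killing the job
wait "$writer" 2>/dev/null || true     # reap the killed writer

lines_after_kill=$(wc -l < "$testfile")

# "Second node": append immediately after the kill.
echo "restart" >> "$testfile"
lines_after_append=$(wc -l < "$testfile")
last_line=$(tail -n 1 "$testfile")

echo "lines before restart: $lines_after_kill, after: $lines_after_append"
```

If the line count after the append is not exactly one more than before, or the last line is not "restart", output was lost or reordered, and that would be worth reporting.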

2) We do lock files, but only the log file. The idea is that we only need
to guarantee that the set of files is accessed by one application at a time.
This seems safe, but if someone sees a way for the trajectory to be opened
without the log file being opened, please file a bug.
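
The pattern can be sketched in a few lines of shell. This is not the GROMACS implementation, only an illustration of one lock guarding a whole set of files. It assumes the `flock` utility from util-linux is installed. The file names follow the `-deffnm run1` convention but are hypothetical, and `append_run` is a made-up helper.

```shell
#!/bin/bash
# Sketch only: a single lock on the log file guards every file in the set.
workdir=$(mktemp -d)
log="$workdir/run1.log"    # hypothetical file names
traj="$workdir/run1.trr"

# append_run NAME [HOLD_SECONDS]: hypothetical helper standing in for one run.
append_run() {
    (
        # The log is open on fd 9; try to lock it without blocking.
        flock -n 9 || { echo "log is locked by another run; refusing to append" >&2; exit 1; }
        # Only the lock holder ever touches the other files in the set.
        echo "frame from $1" >> "$traj"
        echo "$1 appended a frame" >&9
        sleep "${2:-0}"    # keep holding the lock (demonstration only)
    ) 9>>"$log"
}

append_run first 1 &       # first run holds the lock for about a second
sleep 0.3
append_run second          # second run should be refused while the lock is held
second_status=$?
wait
frames=$(wc -l < "$traj")
echo "second run exit status: $second_status, frames written: $frames"
```

The scheme holds only as long as every code path that opens the trajectory takes the log lock first, which is exactly the condition raised in point 2.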

Roland

On Sun, Jun 5, 2011 at 10:13 AM, Mark Abraham <Mark.Abraham at anu.edu.au> wrote:

>  On 5/06/2011 11:08 PM, Francesco Oteri wrote:
>
> Dear Dimitar,
> I'm following the debate regarding:
>
>
>    The point was not "why" I was getting the restarts, but the fact itself
> that I was getting restarts close in time, as I stated in my first post. I
> actually also don't know whether jobs are deleted or suspended. I've thought
> that a job returned back to the queue will basically start from the
> beginning when later moved to an empty slot ... so don't understand the
> difference from that perspective.
>
>
> In the second mail you say:
>
>  Submitted by:
> ========================
> ii=1
> ifmpi="mpirun -np $NSLOTS"
> --------
>    if [ ! -f run${ii}-i.tpr ];then
>        cp run${ii}.tpr run${ii}-i.tpr
>       tpbconv -s run${ii}-i.tpr -until 200000 -o run${ii}.tpr
>    fi
>
>     k=`ls md-${ii}*.out | wc -l`
>    outfile="md-${ii}-$k.out"
>    if [[ -f run${ii}.cpt ]]; then
>
>        $ifmpi `which mdrun` -s run${ii}.tpr -cpi run${ii}.cpt -v -deffnm run${ii} -npme 0 > $outfile 2>&1
>
>     fi
>  =========================
>
>
> If I understand correctly, you are submitting the SERIAL mdrun. This means
> that multiple instances of mdrun are running at the same time.
> Each instance of mdrun is an INDEPENDENT instance. Therefore checkpoint
> files, one for each instance (i.e. one for each CPU), are written at the
> same time.
>
>
> Good thought, but Dimitar's stdout excerpts from early in the thread do
> indicate the presence of multiple execution threads. Dynamic load balancing
> gets turned on, and the DD is 4x2x1 for his 8 processors. Conventionally,
> and by default in the installation process, the MPI-enabled binaries get an
> "_mpi" suffix, but it isn't enforced - or enforceable :-)
>
> Mark



-- 
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309