[gmx-developers] Re: flushing files

Berk Hess hess at cbr.su.se
Wed Oct 13 11:23:42 CEST 2010


That probably depends on whether you run on 10 or 100000 nodes.

I just discussed the flushing with Erik.
I forgot the exact motivation for this, but it was to have only whole frames
on disk.
If you don't flush, you'll often have partial frames.
So the options are: either flush every frame, or buffer and then flush or fsync.
For the MPI I/O this is no issue: since we buffer internally, we can simply
fsync on write, which should happen at least when checkpointing.
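
Roughly, the flush-per-frame and fsync variants look like this in plain C
(just a sketch to illustrate the difference, not actual mdrun code; the
function names are made up):

#include <stdio.h>
#include <unistd.h>   /* fsync() */

static void write_frame(FILE *fp, const void *frame, size_t nbytes)
{
    fwrite(frame, 1, nbytes, fp);

    /* Option 1: flush every frame. Only whole frames reach the OS,
     * but nothing forces them onto the physical disk. */
    fflush(fp);
}

static void checkpoint_sync(FILE *fp)
{
    /* Option 2: buffer, then at checkpoint time push the libc buffer
     * to the OS and force the OS to commit it to disk. */
    fflush(fp);
    fsync(fileno(fp));
}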

I guess the only remaining question is whether flushing could be slow under
circumstances where we would not want to use the MPI buffered I/O?

Berk

On 10/13/2010 11:18 AM, Sander Pronk wrote:
> Just out of curiosity: how long does MPI_File_sync take?
>
> Sander
>
>
> On Oct 13, 2010, at 10:07, Roland Schulz wrote:
>
>>
>>
>> On Wed, Oct 13, 2010 at 3:35 AM, Erik Lindahl <lindahl at cbr.su.se> wrote:
>>
>>     Hi,
>>     On Oct 13, 2010, at 9:23 AM, Roland Schulz wrote:
>>>
>>>     This is not what we are doing at the moment. At the moment
>>>     (flush after frame, sync after checkpoint) it is possible that
>>>     the trajectory is broken. But the checkpointing append
>>>     feature guarantees that it is automatically fixed. I like the
>>>     approach of fast writing + automatic fix in the worst case
>>>     better than having to guarantee that it is always correct from
>>>     the beginning. Also, it would be extremely difficult
>>>     to guarantee that for all cases (e.g. for the case of a crash
>>>     during writing of a frame).
>>
>>     Yes, but that's a huge difference: Presently you might get broken
>>     frames if your simulation crashes. If you are on a file system
>>     that never flushes to disk with fflush() you won't get frames on
>>     the frontend, but at least they aren't broken.
>>
>>
>> I think it is also currently possible (but unlikely) that the
>> trajectory appears broken: this can happen while a frame is being
>> written (I'm pretty sure I have encountered that before). But I see
>> the point that we at least want to make it as unlikely as possible
>> (most of the time we are not in the middle of writing a frame)
>> without affecting the performance.
>>
>> This might actually not be a problem with MPI-IO, because we buffer
>> the whole frame in memory and then have one MPI_File_write call for
>> the whole frame (or, more precisely, one MPI_File_write_ordered call
>> for a couple of frames). Thus, because we always write a whole frame
>> in one go, it should not be an issue. We'll test to make sure.
>> If it is still an issue, we can buffer more frames so that an
>> MPI_File_sync after each write does not cause a performance problem.
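
Roughly, that buffered write path would look something like the sketch
below (illustrative names only, not the actual CollectiveIO code):

#include <mpi.h>

static void write_frame_collective(MPI_File fh, void *buf, int nbytes)
{
    MPI_Status status;

    /* One collective, ordered call per frame (or group of frames):
     * each rank's buffered part lands in the file in rank order, so
     * no partially written frames from per-rank stdio buffering. */
    MPI_File_write_ordered(fh, buf, nbytes, MPI_BYTE, &status);

    /* Force the data to storage; this could be restricted to
     * checkpoint time if syncing every frame turns out to be slow. */
    MPI_File_sync(fh);
}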
>>
>> Independent of my original question and the CollectiveIO work, we
>> might want to make sure that we guarantee an fsync every 15 min, even
>> when we don't checkpoint or only checkpoint infrequently. This might be
>> a fix we want to add to the release branch.
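
A minimal sketch of such a time-based sync (plain POSIX calls, invented
helper name, called from wherever frames are written):

#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define FSYNC_INTERVAL_S (15*60)

static time_t last_sync = 0;

static void maybe_fsync(FILE *fp)
{
    time_t now = time(NULL);

    if (last_sync == 0 || now - last_sync >= FSYNC_INTERVAL_S)
    {
        fflush(fp);
        fsync(fileno(fp));
        last_sync = now;
    }
}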
>>
>> Roland
>>
>
