[gmx-developers] Re: flushing files
Janne.Blomqvist at tkk.fi
Wed Oct 13 14:09:15 CEST 2010
Sander Pronk wrote:
> On Oct 13, 2010, at 11:23 , Berk Hess wrote:
>> I just discussed the flushing with Erik.
>> I forgot the motivation for this, but it was to have only whole frames on disk.
>> If you don't flush, you'll often have partial frames.
>> So the options are: either flush every frame, or buffer and then flush or fsync.
>> For the mpi i/o this is no issue: since we buffer internally, we can simply
>> fsync on write, which should happen at least when checkpointing.
>> I guess the only remaining question is if flushing could be slow under
>> circumstances where we would not want to use the mpi buffered i/o?
> There are a couple of circumstances that could trigger that:
> - we're writing to an unbuffered file system (nfs in synchronous mode, for example).
> - the OS runs out of disk cache (i.e. RAM) and is forced to write out to disk for each write() call. If this happens, there are bigger problems to worry about for the user.
> The first case could be real (due to an overzealous system administrator).
For NFSv2, yes. But NFSv3 introduced the concept of "unstable" writes,
and a separate COMMIT operation. The server is allowed to acknowledge an
unstable write as soon as the data is in server memory, no need to flush
it to nonvolatile storage. The COMMIT operation, then, instructs the
server to flush any dirty data for the file in question to nonvolatile
storage. As one can probably guess from the above, an fsync() syscall is
in practice translated into a bunch of unstable writes (i.e. copying
dirty data from the client to the server) followed by a COMMIT.
Alternatively, the client can issue writes with the stable bit set,
making the COMMIT unnecessary (this is then equivalent to how NFSv2
worked back in the day). My guess is that in practice these stable
writes are used only for files opened with O_SYNC or similar.
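
To make the client-side sequence concrete, here is a minimal sketch of
writing one frame and then pushing it towards stable storage. The helper
name write_frame_durable() is made up for illustration and is not taken
from the GROMACS sources; the NFS behaviour described in the comments is
my understanding as stated above, not something the code can verify.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: append one frame and ask for it to be made stable.
 * Returns 0 on success, -1 on error. */
int write_frame_durable(FILE *fp, const void *frame, size_t nbytes)
{
    assert(fp != NULL && frame != NULL);

    /* 1. Copy the frame into the stdio buffer of this process. */
    if (fwrite(frame, 1, nbytes, fp) != nbytes)
        return -1;

    /* 2. fflush(): stdio buffer -> kernel. On an NFS mount the data may
     *    still be dirty pages on the client at this point. */
    if (fflush(fp) != 0)
        return -1;

    /* 3. fsync(): ask the kernel to make the file data stable. On NFSv3
     *    this is where the unstable writes plus COMMIT would be expected. */
    if (fsync(fileno(fp)) != 0)
        return -1;

    return 0;
}
```

Calling only steps 1 and 2 per frame and step 3 at checkpoint time would
be the "flush every frame, fsync occasionally" variant.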
Now, due to the NFS consistency model, any dirty data must be flushed to
stable storage when a file is closed on the client, so COMMIT is still
used even if there are no explicit fsync() calls.
My understanding is that sync exports still perform worse than async
ones because async treats COMMITs as no-ops, although the performance
difference should be much smaller than it was with NFSv2.
That being said, I don't really see how an fsync() every 15 minutes
could be an issue. If calling fsync() for every frame is too expensive,
couldn't that be worked around, e.g. by making the code robust enough to
not crash on short frames, or by including a checksum of the frame data
in the frame header to guard against otherwise corrupted frames? At
least for my own usage, losing a few frames in the unlikely event of a
crash is no big deal, as long as I don't lose everything (and
checkpoints every 15 min should ensure I never lose more than 15 min
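
As a rough illustration of the checksum-in-header idea, here is a small
sketch. The frame_header layout and the functions frame_write() and
frame_read() are hypothetical and do not reflect the actual GROMACS
trajectory formats; the checksum is a 32-bit FNV-1a hash, chosen only
because it is short, where a real format would more likely use a CRC.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical frame header, written in host byte order for brevity. */
typedef struct {
    uint32_t nbytes;    /* payload length in bytes */
    uint32_t checksum;  /* FNV-1a hash of the payload */
} frame_header;

/* 32-bit FNV-1a, used here only as a simple stand-in for a CRC. */
static uint32_t fnv1a(const unsigned char *p, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Write header + payload. Returns 0 on success, -1 on error. */
int frame_write(FILE *fp, const void *payload, uint32_t nbytes)
{
    assert(fp != NULL && payload != NULL);
    frame_header h = { nbytes, fnv1a(payload, nbytes) };
    if (fwrite(&h, sizeof h, 1, fp) != 1) return -1;
    if (fwrite(payload, 1, nbytes, fp) != nbytes) return -1;
    return 0;
}

/* Read one frame into buf (capacity cap bytes).
 * Returns 1 for a valid frame, 0 at the end of usable data
 * (EOF, short frame, implausible length, or checksum mismatch). */
int frame_read(FILE *fp, void *buf, uint32_t cap, uint32_t *nbytes)
{
    assert(fp != NULL && buf != NULL && nbytes != NULL);
    frame_header h;
    if (fread(&h, sizeof h, 1, fp) != 1) return 0;          /* EOF or short header */
    if (h.nbytes > cap) return 0;                           /* implausible length */
    if (fread(buf, 1, h.nbytes, fp) != h.nbytes) return 0;  /* short payload */
    if (fnv1a(buf, h.nbytes) != h.checksum) return 0;       /* corrupted payload */
    *nbytes = h.nbytes;
    return 1;
}
```

A reader built this way simply stops at the first frame that fails the
check, so a partial last frame after a crash costs that frame and
nothing before it.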