[gmx-users] Why does the -append option exist?

Dimitar Pachov dpachov at brandeis.edu
Sun Jun 5 09:42:33 CEST 2011


On Sun, Jun 5, 2011 at 2:14 AM, Mark Abraham <Mark.Abraham at anu.edu.au>wrote:

>  On 5/06/2011 12:31 PM, Dimitar Pachov wrote:
>
> As I said, the queue is like this: you submit the job, it finds an empty
> node, it goes there, however seconds later another user with
> higher privileges on that particular node submits a job, his job kicks out
> my job, mine goes on the queue again, it finds another empty node, goes
> there, then another user with high privileges on that node submits a job,
> which consequently kicks out my job again, and the cycle repeats itself ...
> theoretically, it could continue forever, depending on how many and where
> the empty nodes are, if any.
>
>
> You've said that *now* - but previously you've said nothing about why you
> were getting lots of restarts. In my experience, PBS queues suspend jobs
> rather than deleting them, in order that resources are not wasted.
> Apparently other places do things this way. I think that this information is
> highly relevant to explaining your observations.
>
>

The point was not *why* I was getting the restarts, but the fact itself that
I was getting restarts close together in time, as I stated in my first post.
I actually also don't know whether jobs are deleted or suspended. I had
thought that a job returned to the queue would basically start from the
beginning when later moved to an empty slot ... so I don't understand the
difference from that perspective.




>
>  These many restarts suggest that the queue was full of relatively short
> jobs run by users with high privileges. Technically, I cannot see why the
> same processes should be running simultaneously because at any instant my
> job runs only on one node, or it stays in the queuing list.
>
>
> I/O can be buffered such that the termination of the process and the
> completion of its I/O are asynchronous. Perhaps it *shouldn't* be that way,
> but this is a problem for the administrators of your cluster to address.
> They know how the file system works. If the next job executes before the old
> one has finished output, then I think the symptoms you observe might be
> possible.
>

Yes, this is true, and I believe the timing of when the buffer is fully
flushed is crucial to any possible explanation of the observed behavior.
However, this bottleneck has been known for a long time, so I expected
people had thought about it before confidently making -append the default.
That's all.
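To illustrate the buffering issue being discussed: a write() returning does
not mean the data is on disk. A minimal sketch (the filename "demo.out" is
purely illustrative):

```python
import os

# Buffered writes sit in user-space and kernel buffers; another process
# (or, on a shared filesystem, another node) may not see them until they
# are flushed and synced.
with open("demo.out", "w") as f:
    f.write("checkpoint data\n")
    f.flush()             # push the user-space buffer to the kernel
    os.fsync(f.fileno())  # ask the kernel to commit the data to disk

# Only after fsync returns is the data durable on a local filesystem; on
# network filesystems (NFS, Lustre) visibility to other nodes may still
# lag, which is exactly the restart scenario discussed above.
```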


>
> Note that there is nothing GROMACS can do about that, unless somehow
> GROMACS can apply a lock in the first mdrun that is respected by your file
> system such that a subsequent mdrun cannot open the same file until all
> pending I/O has completed. I'd expect proper HPC file systems do that
> automatically, but I don't really know.
>
>
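For reference, such a lock could look roughly like the following sketch
(a POSIX advisory lock via Python's fcntl module; "md.log" is just an
illustrative filename, not necessarily how mdrun does it):

```python
import fcntl

# Sketch: the kind of advisory lock a first mdrun could hold on its log
# file so that a second instance cannot append to the same file
# concurrently. Advisory locks bind only cooperating processes, and
# their behavior over NFS depends on the lock daemon, so this is
# illustrative rather than a guarantee.
log = open("md.log", "a")
try:
    # Non-blocking exclusive lock; raises OSError if another process
    # already holds it.
    fcntl.lockf(log, fcntl.LOCK_EX | fcntl.LOCK_NB)
    locked = True   # safe to append; released when the file is closed
except OSError:
    locked = False  # another process is still writing this file
```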
I am not an expert, nor do I know the Gromacs code, but could one have an
option to specify a period after its initial start during which Gromacs is
prohibited from writing any output files, i.e. some kind of suspension
and/or waiting period?

I am also wondering about the checkpoint timing - the default is 15 min, but
what is the minimum? Since I have not tested it, what would happen if
I specified 0.001 min, for example?
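For a sense of scale, some back-of-the-envelope arithmetic, assuming the
interval is given in minutes (matching the 15-min default above) and an
assumed, not measured, cost of 1 s per checkpoint write for a large system:

```python
# -cpt 0.001 (minutes) converted to seconds between checkpoint attempts.
interval_s = 0.001 * 60          # 0.06 s between checkpoints

# Purely assumed figure for serializing the full state to disk.
write_time_s = 1.0

# If a single write takes longer than the interval, the run would spend
# essentially all its time checkpointing instead of simulating.
starved = write_time_s > interval_s
```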


>
>
>>  From md-1-2360.out:
>> =====================================
>>  :::::::
>>   Getting Loaded...
>> Reading file run1.tpr, VERSION 4.5.4 (single precision)
>>
>>  Reading checkpoint file run1.cpt generated: Tue May 31 10:45:22 2011
>>
>>
>>  Loaded with Money
>>
>>  Making 2D domain decomposition 4 x 2 x 1
>>
>>  WARNING: This run will generate roughly 4915 Mb of data
>>
>>  starting mdrun 'run1'
>> 100000000 steps, 200000.0 ps (continuing from step 51879590, 103759.2 ps).
>>  =====================================
>>
>>
>>  These aren't showing anything other than that the restart is coming from
>> the same point each time.
>>
>>
>>  And from the last generated output md-1-2437.out (I think I killed the
>> job at that point because of the above observed behavior):
>> =====================================
>>  :::::::
>>  Getting Loaded...
>> Reading file run1.tpr, VERSION 4.5.4 (single precision)
>>  =====================================
>>
>>  I have at least 5-6 additional examples like this one. In some of them
>> the *xtc file does have a size greater than zero, yet still very small, but
>> it starts from some random frame (for example, in one of the cases it
>> contains frames from ~91000ps to ~104000ps, but all frames before 91000ps
>> are missing).
>>
>>
>>  I think that demonstrating a problem requires that the set of output
>> files were fine before one particular restart, and weird afterwards. I don't
>> think we've seen that yet.
>>
>>
>  I don't understand your point here. I am providing you with all the info I
> have. I am showing the output files of 3 restarts, and they differ in
> the sense that the last two did not progress far enough before another job
> restart occurred. The first was fine before the restart, and the others were
> not exactly fine after the restart. At this point I realize that what I call
> "restart" and what you call "restart" might be two different things. And
> here is where the problem might lie.
>
>
>
>>
>>  I realize there might be another problem, but the bottom line is that
>> there is no mechanism that can prevent this from happening if many restarts
>> are required, and particularly if the timing between these restarts is prone
>> to be small (distributed computing could easily satisfy this condition).
>>
>>  Any suggestions, particularly regarding the path of least resistance for
>> regenerating the missing data? :)
>>
>>
>>
>>>
>>>
>>>
>>> Using the checkpoint capability & appending makes sense when many restarts
>>> are expected, but unfortunately it is exactly then that these options
>>> completely fail! As a new user of Gromacs, I must say I am disappointed, and
>>> would like an explanation of why the usage of these options is
>>> clearly stated to be safe when it is not, why the append option is the
>>> default, and why not a single warning has been posted anywhere in
>>> the docs & manuals?
>>>
>>>
>>>  I can understand and sympathize with your frustration if you've
>>> experienced the loss of a simulation. Do be careful when suggesting that
>>> others' actions are blame-worthy, however.
>>>
>>
>>  I have never suggested this. As a user, I am entitled to ask.
>>
>>
>>  Sure. However, talking about something that can "completely fail"
>>
>
>  This is a fact, backed up by my evidence => I don't see anything bad
> directed at anybody.
>
>
>>  which makes you "disappointed"
>>
>
>  This is me being honest => again not related to anybody else.
>
>
>>  and wanting to "obtain an explanation"
>>
>
>  Well, this one is even funny :) - many people want this, especially in
> science. Is that bad?
>
>
>>  about why something doesn't work as stated and lacks "a single warning"
>>
>
>  Again a fact => again nothing bad here.
>
>
>>  suggests that someone has done something less than appropriate
>>
>
>  This is a completely personal interpretation, and I am personally not
> responsible for how people perceive information. For reasons unknown to me,
> you have moved into a very defensive mode. What could I do?
>
>
>> , and so blame-worthy. It also assumes that the actions of a new user were
>> correct, and the actions of a developer with long experience were not.
>>
>
>  Sorry, this is too much. Where was this suggested? It seems to me you
> took it too personally.
>
>
>> This may or may not prove to be true. Starting such a discussion from a
>> conciliatory (rather than antagonistic) stance is usually more productive.
>> The shared objective should be to fix the problem, not prove that someone
>> did something wrong.
>>
>
>  Agreed, and I did. Again, your perception does not seem to be
> correlated with my intended approach.
>
>
> Words are open to interpretation. Communicating well requires that you
> consider the impact of your words on your reader. You want people who can
> address the problem to want to help. You don't want them to feel defensive
> about the situation - whether you think that would be an over-reaction or
> not.
>
>
I got your point(s). However, I respectfully disagree with some of them.
First, I believe the information one's sentences convey matters much more
than how specifically they are written. People are different, and hence
one cannot be a chameleon, constantly changing one's way of expressing
oneself to adapt to everybody's perceptions at any time for the sake of
always satisfying the other side. Such constant consideration of others
is impossible, and it neither distinguishes you as an individual nor
makes you feel like a free person. Second, the impact of somebody's words
depends almost entirely on the listener/reader, and much less on the
person who communicates them. For example, you cannot persuade anybody of
anything unless they decide for themselves to accept your words. How good
you are at persuasion & communication is irrelevant. Third, the word
"tone" generally refers to verbal communication, and it is a very
subjective call, to say the least, to judge "the tone" of written
communication. Fourth, cultural & expression differences may arise when
people communicate in non-native languages, and this cannot simply be
ignored.

Therefore, I acknowledge the diversity of ways we express ourselves, and
try not to experience negative feelings while reading plain written words on
technical matters; otherwise, I would become unnecessarily biased and
prejudiced without any real objective reason for being so.

Thanks,
Dimitar

