[gmx-users] System volume "jumps" on exact continuations

Szilárd Páll pall.szilard at gmail.com
Thu Jun 1 22:43:01 CEST 2017


On Thu, Jun 1, 2017 at 9:39 PM, Elizabeth Ploetz <ploetz at ksu.edu> wrote:

> However, if most runs are group scheme, a quick check could show whether
> jumps are present in runs that i) do PP-PME tuning ii) if logs got truncated
> during continuation, at least whether they use separate PME ranks
> (because otherwise CPU-only runs don't tune).
>
> i) If grepping "timed" from the LOG file does not give any output, does
> that mean there was no PP-PME tuning? (Sorry for the stupid question. I'm
> not sure which piece of information from the LOG file is going to answer
> whether or not there was PP-PME tuning.)


Do you run with -append? If so, the log file gets truncated too. I do not
recall exactly where, or whether the PP-PME balancing messages are removed,
but it's not hard to try: just run with separate PME ranks and too few of
them (e.g. 1 out of 12), and that will trigger load balancing.
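
A minimal sketch of that check, in Python. It builds an mdrun command line
with 1 separate PME rank out of 12 and filters a log for lines containing
"timed" (the string grepped for in the question). The launcher name
`mpirun`, the binary name `gmx_mpi`, the run name `topol` and both function
names are assumptions for illustration; older installations may call the
binary `mdrun_mpi` instead.

```python
def build_mdrun_command(total_ranks=12, pme_ranks=1, deffnm="topol"):
    """Return an mdrun command line that uses separate PME ranks.

    Assumes an MPI build started through `mpirun` as `gmx_mpi`;
    adjust both names to match the local installation.
    """
    if not 0 < pme_ranks < total_ranks:
        raise ValueError("need at least one PP rank and one PME rank")
    return [
        "mpirun", "-np", str(total_ranks),
        "gmx_mpi", "mdrun",
        "-npme", str(pme_ranks),  # number of separate PME ranks
        "-deffnm", deffnm,        # default name for all run files
    ]


def tuning_lines(log_text, marker="timed"):
    """Return the log lines that contain the marker string."""
    return [line for line in log_text.splitlines() if marker in line]
```

Running `tuning_lines` on the log before and after a continuation with
-append would show whether the balancing messages survive the truncation.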

On second thought, instead of testing with Verlet, you might want to just
do the above and try to directly observe the anomalies after the balancer.
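
One crude way to look for such an anomaly is to compare the mean volume in
a window just before and just after the step where the balancer finished.
The sketch below assumes the volume time series has already been extracted
into a Python list; the function name, the window size and the 3-sigma
threshold are arbitrary choices for illustration, and the test ignores time
correlation in the data.

```python
from statistics import mean, stdev


def volume_jump(volumes, boundary, window=50, threshold=3.0):
    """Compare mean volume just before and just after index `boundary`.

    Returns (shift, is_jump). `shift` is the difference between the two
    window means; `is_jump` is True when |shift| exceeds `threshold`
    times the sample standard deviation of the window before the boundary.
    """
    before = volumes[max(0, boundary - window):boundary]
    after = volumes[boundary:boundary + window]
    if len(before) < 2 or not after:
        raise ValueError("not enough samples around the boundary")
    shift = mean(after) - mean(before)
    noise = stdev(before)
    return shift, abs(shift) > threshold * noise
```

The same function can be applied at each continuation point to check
whether the shifts line up with restarts, with tuning, or with both.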


> If so, perhaps there is a correlation between having PP-PME tuning and
> having a jump. Please see this link:
> <http://i1243.photobucket.com/albums/gg545/ploetz/volumeJumps_zps8hmlghtn.png>.
> *If* the volume for 40-60 ns of row 3 is the correct system volume, then all
> the data in this figure is consistent with there being a jump when there is
> PP-PME tuning.
> (Please note that while the data at 1 bar looks okay in this case, and
> elevated pressures do not, this is not always true. We get jumps at 1 bar
> as well sometimes.)
> ii) These are all CPU-only runs. The simulations always use separate PME
> ranks.
> Please let me know if any particular data from the LOG file would be
> helpful.
>

It would be easier if you provided logs that we can look through.


>
> If I understood correctly, it's only group scheme runs where this has been
> observed, so it could be some newer feature/change that interacts badly
> with the group scheme.
>
> You are correct, so far we have not seen any jumps with Verlet.
>
> BTW, do you have any data with 4.5?
>
> I have a few old simulations with version 4.5.3 (none with 4.5, sorry).
> They were all run with inexact continuations (i.e., I did not provide
> checkpoint files when running multiple short runs to create one long
> simulation) or single trajectories that I had killed at various points and
> then continued using checkpoint files and -append. I don't have a huge data
> set with 4.5.3, but none of them exhibited jumps!
>
> I'd suggest that (especially if investigation of the current data does not
> reveal the reasons) you pick a setup where you seemed to get the anomaly
> and run lots of short runs with restarts in a loop, using the same settings
> but with the Verlet scheme.
>
> Thanks, we are doing this test.
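
A rough sketch of such a restart loop, in Python. Each invocation continues
from the checkpoint with -cpi and -append and stops after a short wall-clock
limit set with -maxh. The binary name `gmx`, the run name `topol`, the
time limit and the function names are assumptions for illustration. The
Verlet scheme itself is selected in the .mdp file (cutoff-scheme = Verlet),
not on the mdrun command line.

```python
import subprocess


def mdrun_restart_command(deffnm="topol", maxh=0.1, gmx="gmx"):
    """Build one mdrun invocation that continues from a checkpoint.

    The first invocation is expected to find no checkpoint file and
    start from the beginning of the .tpr file.
    """
    return [
        gmx, "mdrun",
        "-deffnm", deffnm,
        "-cpi", deffnm + ".cpt",  # checkpoint to continue from
        "-append",                # append to the existing output files
        "-maxh", str(maxh),       # stop after roughly this many hours
    ]


def run_with_restarts(n_runs, run=subprocess.run, **kwargs):
    """Run the same continuation command n_runs times in a row."""
    cmd = mdrun_restart_command(**kwargs)
    for _ in range(n_runs):
        run(cmd, check=True)  # raise if mdrun exits with an error
    return cmd
```

For example, `run_with_restarts(20, deffnm="npt", maxh=0.05)` would attempt
20 short segments of one trajectory, which can then be checked for jumps at
each restart.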

