[gmx-developers] Following forces with domain decomposition

Berk Hess hess at kth.se
Tue Jun 9 21:57:39 CEST 2020


Hi,

I think Mark's suggestion is very good. It avoids all special 
communication for Drude oscillators and only requires adding the Drude 
connections to the update group setup.
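
To illustrate the update-group idea, here is a minimal Python sketch, not GROMACS code. It merges atoms joined by constraints or by a Drude-parent link into indivisible groups, then assigns each whole group to a domain by its mean position. All names (build_update_groups, assign_domains, the toy coordinates) are hypothetical, the decomposition is a 1D slab, and periodic wrapping is ignored.

```python
from collections import defaultdict


def build_update_groups(n_atoms, links):
    """Merge atoms joined by any link into one group (simple union-find)."""
    parent = list(range(n_atoms))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for i in range(n_atoms):
        groups[find(i)].append(i)
    return sorted(groups.values())


def slab_of(position, n_domains, box_length):
    """Index of the 1D slab that contains a position (no periodic wrapping)."""
    slab = box_length / n_domains
    return min(int(position // slab), n_domains - 1)


def assign_domains(x, groups, n_domains, box_length):
    """Place every atom of a group in the slab holding the group's mean position."""
    domain_of_atom = {}
    for group in groups:
        center = sum(x[i] for i in group) / len(group)
        d = slab_of(center, n_domains, box_length)
        for i in group:
            domain_of_atom[i] = d
    return domain_of_atom


# Toy system (assumed values): atoms 0-2 form a constrained -CH2- unit,
# atom 3 is a heavy atom and atom 4 is its Drude particle.
constraint_links = [(0, 1), (0, 2)]
drude_links = [(3, 4)]
x = [0.20, 0.25, 0.15, 0.97, 1.02]  # boundary between the two slabs is at 1.0

groups = build_update_groups(len(x), constraint_links + drude_links)
domains = assign_domains(x, groups, n_domains=2, box_length=2.0)

# A strictly geometric, per-atom assignment for comparison.
per_atom = [slab_of(xi, 2, 2.0) for xi in x]
```

In this toy case the per-atom assignment puts the Drude (atom 4) in a different slab from its parent (atom 3), while the group-based assignment keeps both in the same slab, so no Drude-specific communication would be needed.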

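Is what you are printing based on the global atom number? If not, that 
explains a lot, since with domain decomposition a local index need not 
refer to the same atom as in the single-domain run.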
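
As a rough illustration of why the index matters, the Python sketch below keys a force trace on the global atom number rather than the local array position. It is not GROMACS code: the per-rank index lists, the force values, and the helper names (local_index_of, trace_force) are made up for the example.

```python
def local_index_of(global_atom, local_to_global):
    """Return the local index of a global atom on this rank, or None if absent."""
    try:
        return local_to_global.index(global_atom)
    except ValueError:
        return None


def trace_force(tag, global_atom, local_to_global, f):
    """Format a trace line keyed by global atom number.

    Returns None when the atom is not present on this rank.
    """
    i = local_index_of(global_atom, local_to_global)
    if i is None:
        return None
    fx, fy, fz = f[i]
    return f"{tag} global {global_atom} (local {i}) f = {fx:.6f} {fy:.6f} {fz:.6f}"


# Hypothetical per-rank tables: list position = local index, value = global atom.
rank0_atoms = [52, 53, 54]
rank1_atoms = [55, 56, 57]

# Arbitrary force values, one (fx, fy, fz) triple per local atom.
rank0_forces = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (1.0, 2.0, 3.0)]
rank1_forces = [(9.0, 9.0, 9.0), (8.0, 8.0, 8.0), (7.0, 7.0, 7.0)]

line_rank0 = trace_force("after move_f", 54, rank0_atoms, rank0_forces)
line_rank1 = trace_force("after move_f", 54, rank1_atoms, rank1_forces)
```

Here local index 2 is global atom 54 on the first rank but global atom 57 on the second, so printing f[2] on both ranks would silently compare two different atoms. Looking the atom up by global number gives a line on the first rank and nothing on the second.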

Cheers,

Berk

On 2020-06-09 20:26, Mark Abraham wrote:
> Hi,
>
> Perhaps not the solution you're looking for, but since 2019, DD has 
> been based on the notion of a domain being a compact collection of 
> update groups (which are indivisible units like -CH2-) rather than a 
> strict geometric criterion. That was done so that h-bond only 
> constraints need not communicate, but is probably also a good choice 
> for Drude+parent. You should still be able to validate the 
> single-domain cases with your old code based on a long-ago version.
>
> Mark
>
> On Tue, 9 Jun 2020 at 19:14, Justin Lemkul <jalemkul at vt.edu> wrote:
>
>     ...
>     Hi All,
>
>     I'm trying (once again) to get back into figuring out the lingering
>     bugs with the Drude implementation when using domain decomposition.
>     Since I last asked for help, I have gotten coordinate and velocity
>     communication working properly. Now, I'm stuck on forces. To quickly
>     recap the issue, it is possible for Drudes and their parent atoms to
>     be separated into different domains. This requires communication of
>     coordinates, velocities, and forces via treatment as "special atoms",
>     as is the case with virtual sites. As such, my implementation largely
>     follows what happens for the virtual sites (communicate after any
>     update).
>
>     I have been tracing the forces at every step of do_force - basically
>     printing out the force on a Drude that I know is in a different
>     domain from its parent atom. I use the OpenMP output as reference. I
>     can reproduce the OpenMP forces with domain decomposition but no
>     communication (e.g. gmx mdrun -ntmpi 2 -npme 1 -deffnm md -nb cpu),
>     based on Berk's suggestion from a long time ago. So the issue I'm
>     having must be coming from communicating somewhere, but I can't nail
>     it down. Here is an example of the output I'm looking at.
>
>     First, from OpenMP (my reference, the correct output):
>
>     === Step 0 ===
>     DO FORCE: top f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 1271.383667 -3106.622803 2148.540283
>     DO FORCE: after nbnxn_atomdata_add_nbat_fshift_to_fshift f[54] = 1271.383667 -3106.622803 2148.540283
>     DO FORCE: after do_force_lowlevel f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: b4 move_f f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: after move_f f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: after GPU use/emulate f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: after vsite_spread f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: b4 post f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: end f[54] = 58.264297 16.147758 43.956337
>     === Step 1 ===
>     DO FORCE: top f[54] = 58.264297 16.147758 43.956337
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 1205.647705 -3128.451904 2138.944580
>     DO FORCE: after nbnxn_atomdata_add_nbat_fshift_to_fshift f[54] = 1205.647705 -3128.451904 2138.944580
>     DO FORCE: after do_force_lowlevel f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: b4 move_f f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: after move_f f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: after GPU use/emulate f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: after vsite_spread f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: b4 post f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: end f[54] = 162.370026 -306.717041 -321.102356
>
>
>     Now, my implementation with domain decomposition:
>
>     === Step 0 ===
>     DO FORCE: top f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 338.912842 -2940.618164 2357.080078
>     DO FORCE: after do_force_lowlevel f[54] = 1899.546387 -1663.452881 1703.655273
>     DO FORCE: b4 move_f f[54] = 1899.546387 -1663.452881 1703.655273
>     DO FORCE: after move_f f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: after GPU use/emulate f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: after vsite_spread f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: b4 post f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: end f[54] = 58.260483 16.149330 43.951458
>     === Step 1 ===
>     DO FORCE: top f[54] = 58.260483 16.149330 43.951458
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 265.444092 -2965.024170 2346.120117
>     DO FORCE: after do_force_lowlevel f[54] = 1834.273926 -1685.225830 1654.119141
>     DO FORCE: b4 move_f f[54] = 1834.273926 -1685.225830 1654.119141
>     DO FORCE: after move_f f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: after GPU use/emulate f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: after vsite_spread f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: b4 post f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: end f[54] = 229.446487 -248.274734 -255.144485
>
>     From this output, I can see that communication works in step 0 and
>     between steps 0 and 1, since the force is correctly propagated. I
>     also do not know to what extent I can expect forces to match before
>     the "move_f" step (which is where I communicate non-local Drude
>     forces, following the existing "dd_move_f" in do_force_cutsVERLET).
>     But the forces should certainly be the same after communicating, so
>     that they are correctly input to post_process_forces.
>
>     Can anyone suggest how the code paths might differ between these two
>     steps? I've debugged every step along the way that I can figure out,
>     and all I can come up with is that the forces end up different. I
>     know that may be a big request without seeing the code, but I'm
>     simply determining non-local Drudes the same way we do with vsites,
>     and communicating their forces with the existing dd_move_f_specat
>     function that vsites also use.
>
>     Any help would be greatly appreciated. I've been stuck on this
>     forever, and it is clear that our user community really wants this
>     feature. I can give them OpenMP easily, but that's rather
>     restrictive...
>
>     -Justin
>
>     -- 
>     ==================================================
>
>     Justin A. Lemkul, Ph.D.
>     Assistant Professor
>     Office: 301 Fralin Hall
>     Lab: 303 Engel Hall
>
>     Virginia Tech Department of Biochemistry
>     340 West Campus Dr.
>     Blacksburg, VA 24061
>
>     jalemkul at vt.edu | (540) 231-3129
>     http://www.thelemkullab.com
>
>     ==================================================
>
