[gmx-developers] Following forces with domain decomposition

David van der Spoel spoel at xray.bmc.uu.se
Wed Jun 10 08:38:00 CEST 2020


On 2020-06-09 at 20:26, Mark Abraham wrote:
> Hi,
> 
> Perhaps not the solution you're looking for, but since 2019, DD has been 
> based on the notion of a domain being a compact collection of update 
> groups (which are indivisible units like -CH2-) rather than a strict 
> geometric criterion. That was done so that h-bond only constraints need 
> not communicate, but is probably also a good choice for Drude+parent. 
> You should still be able to validate the single-domain cases with your 
> old code based on a long-ago version.
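
To make the update-group idea above concrete, here is a minimal one-dimensional sketch in Python. It is a toy model, not GROMACS code: the function name assign_domains, the equal-width slabs, and the use of each group's center of geometry to pick a domain are assumptions made purely for illustration. It also assumes every group is already whole, i.e. not split across the periodic boundary.

```python
def assign_domains(x, groups, box_length, n_domains):
    """Return a domain index per atom, placing each group as one unit.

    x          -- 1D atom coordinates (toy model: one axis only)
    groups     -- list of lists of atom indices; every atom in exactly one
    box_length -- periodic box length along this axis
    n_domains  -- number of equal-width slabs (hypothetical layout)
    """
    width = box_length / n_domains
    domain_of_atom = [None] * len(x)
    for members in groups:
        # Center of geometry of the group, wrapped into the box.
        cog = sum(x[i] for i in members) / len(members)
        domain = int((cog % box_length) // width)
        domain = min(domain, n_domains - 1)  # guard against rounding at the edge
        for i in members:
            domain_of_atom[i] = domain
    return domain_of_atom


# Parent (atom 0) and Drude (atom 1) sit on either side of the boundary at 1.0.
x = [0.98, 1.03, 1.50, 1.56]
per_atom = assign_domains(x, [[0], [1], [2], [3]], box_length=2.0, n_domains=2)
per_group = assign_domains(x, [[0, 1], [2, 3]], box_length=2.0, n_domains=2)
print(per_atom)   # the pair is split across two domains
print(per_group)  # the pair stays in one domain
```

With a strict per-atom geometric criterion the pair straddling the boundary ends up in two domains; treating it as one indivisible group keeps it together, so no extra communication is needed for that pair.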

That sounds a lot like old-style charge groups. Out of curiosity: how 
does this affect the need to have atoms in the box for PME?

> Mark
> 
> On Tue, 9 Jun 2020 at 19:14, Justin Lemkul <jalemkul at vt.edu> wrote:
> 
> 
>     Hi All,
> 
>     I'm trying (once again) to get back into figuring out the lingering
>     bugs with the Drude implementation when using domain decomposition.
>     Since I last asked for help, I have gotten coordinate and velocity
>     communication working properly. Now, I'm stuck on forces. To quickly
>     recap the issue, it is possible that Drudes and their parent atoms
>     get separated into different domains. This requires communication of
>     coordinates, velocities, and forces via treatment as "special atoms",
>     as is the case with virtual sites. As such, my implementation largely
>     follows what happens for the virtual sites (communicate after any
>     update).
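
The force part of that communication can be pictured with a small Python sketch. This is a toy model under stated assumptions, not the GROMACS implementation: the function name move_f is borrowed from the trace labels only for readability, the dictionary layout is hypothetical, and the numbers are made up. The idea shown is the reverse of coordinate communication: every rank that holds a non-local copy of an atom sends its partial force back, and the atom's home rank sums the contributions.

```python
def move_f(partial_forces, home_rank):
    """Sum partial forces from all ranks onto each atom's home rank.

    partial_forces -- {rank: {atom: (fx, fy, fz)}}, including non-local copies
    home_rank      -- {atom: rank that owns (updates) the atom}
    Returns {rank: {atom: (fx, fy, fz)}} holding only home atoms.
    """
    total = {rank: {} for rank in partial_forces}
    for rank, forces in partial_forces.items():
        for atom, f in forces.items():
            owner = home_rank[atom]
            acc = total[owner].get(atom, (0.0, 0.0, 0.0))
            total[owner][atom] = tuple(a + b for a, b in zip(acc, f))
    return total


# Hypothetical numbers: Drude 54 is home on rank 0, but rank 1 holds a
# non-local copy of it because the parent atom (53) lives there.
partial = {
    0: {54: (3.0, -1.0, 2.0)},
    1: {54: (-1.0, 0.5, 1.0), 53: (0.25, 0.0, -0.5)},
}
home = {54: 0, 53: 1}
summed = move_f(partial, home)
print(summed)
```

In this picture the force on a home atom is only complete after the reduction, which is why a value printed before the communication step can legitimately differ from a single-domain reference.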
> 
>     I have been tracing the forces at every step of do_force - basically
>     printing out the force on a Drude that I know is in a different domain
>     from its parent atom. I use the OpenMP output as reference. I can
>     reproduce the OpenMP forces with domain decomposition but no
>     communication (e.g. gmx mdrun -ntmpi 2 -npme 1 -deffnm md -nb cpu),
>     based on Berk's suggestion from a long time ago. So the issue I'm
>     having must be coming from communicating somewhere, but I can't nail
>     it down. Here is an example of the output I'm looking at.
> 
>     First, from OpenMP (my reference, the correct output):
> 
>     === Step 0 ===
>     DO FORCE: top f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 1271.383667 -3106.622803 2148.540283
>     DO FORCE: after nbnxn_atomdata_add_nbat_fshift_to_fshift f[54] = 1271.383667 -3106.622803 2148.540283
>     DO FORCE: after do_force_lowlevel f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: b4 move_f f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: after move_f f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: after GPU use/emulate f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: after vsite_spread f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: b4 post f[54] = 82.651733 130.833740 82.218506
>     DO FORCE: end f[54] = 58.264297 16.147758 43.956337
>     === Step 1 ===
>     DO FORCE: top f[54] = 58.264297 16.147758 43.956337
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 1205.647705 -3128.451904 2138.944580
>     DO FORCE: after nbnxn_atomdata_add_nbat_fshift_to_fshift f[54] = 1205.647705 -3128.451904 2138.944580
>     DO FORCE: after do_force_lowlevel f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: b4 move_f f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: after move_f f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: after GPU use/emulate f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: after vsite_spread f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: b4 post f[54] = 200.794189 -175.644287 -279.924072
>     DO FORCE: end f[54] = 162.370026 -306.717041 -321.102356
> 
> 
>     Now, my implementation with domain decomposition:
> 
>     === Step 0 ===
>     DO FORCE: top f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 338.912842 -2940.618164 2357.080078
>     DO FORCE: after do_force_lowlevel f[54] = 1899.546387 -1663.452881 1703.655273
>     DO FORCE: b4 move_f f[54] = 1899.546387 -1663.452881 1703.655273
>     DO FORCE: after move_f f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: after GPU use/emulate f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: after vsite_spread f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: b4 post f[54] = 82.647949 130.835449 82.213165
>     DO FORCE: end f[54] = 58.260483 16.149330 43.951458
>     === Step 1 ===
>     DO FORCE: top f[54] = 58.260483 16.149330 43.951458
>     DO FORCE: after do_nb_verlet #1 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after do_nb_verlet #2 f[54] = 0.000000 0.000000 0.000000
>     DO FORCE: after nbnxn_atomdata_add_nbat_f_to_f f[54] = 265.444092 -2965.024170 2346.120117
>     DO FORCE: after do_force_lowlevel f[54] = 1834.273926 -1685.225830 1654.119141
>     DO FORCE: b4 move_f f[54] = 1834.273926 -1685.225830 1654.119141
>     DO FORCE: after move_f f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: after GPU use/emulate f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: after vsite_spread f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: b4 post f[54] = 258.300781 -122.286865 -219.277039
>     DO FORCE: end f[54] = 229.446487 -248.274734 -255.144485
> 
>     From this output, I can see that communication works in step 0 and
>     between steps 0 and 1, since the force is correctly propagated. I also
>     do not know to what extent I can expect forces to match before the
>     "move_f" step (which is where I communicate non-local Drude forces, and
>     which follows the existing "dd_move_f" in do_force_cutsVERLET). But the
>     forces should certainly be the same after communicating so they are
>     correctly input to post_process_forces.
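
The comparison described above can be automated with a short Python sketch that parses two such traces and reports the stages where they disagree. The helper names parse_trace and mismatches are hypothetical, and the absolute tolerance of 0.05 is an arbitrary assumption chosen only to sit above the single-precision noise visible at step 0. The sample values are copied from the "after move_f" and "end" lines of the two traces in this message. Only post-communication stages are compared, since it is not established that earlier stages should match.

```python
import re

STEP = re.compile(r"=== Step (\d+) ===")
FORCE = re.compile(
    r"DO FORCE: (.+?) f\[\d+\] =\s*(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)"
)


def parse_trace(text):
    """Return {(step, stage): (fx, fy, fz)} from a do_force debug trace."""
    forces = {}
    parts = STEP.split(text)  # ['', '0', body0, '1', body1, ...]
    for step, body in zip(parts[1::2], parts[2::2]):
        for stage, fx, fy, fz in FORCE.findall(body):
            forces[(int(step), stage)] = (float(fx), float(fy), float(fz))
    return forces


def mismatches(ref, test, stages, abs_tol=0.05):
    """List (step, stage, largest component difference) above abs_tol."""
    out = []
    for key, f_ref in ref.items():
        if key[1] not in stages or key not in test:
            continue
        worst = max(abs(a - b) for a, b in zip(f_ref, test[key]))
        if worst > abs_tol:
            out.append((key[0], key[1], worst))
    return out


REF = """\
=== Step 0 ===
DO FORCE: after move_f f[54] = 82.651733 130.833740 82.218506
DO FORCE: end f[54] = 58.264297 16.147758 43.956337
=== Step 1 ===
DO FORCE: after move_f f[54] = 200.794189 -175.644287 -279.924072
DO FORCE: end f[54] = 162.370026 -306.717041 -321.102356
"""
DD = """\
=== Step 0 ===
DO FORCE: after move_f f[54] = 82.647949 130.835449 82.213165
DO FORCE: end f[54] = 58.260483 16.149330 43.951458
=== Step 1 ===
DO FORCE: after move_f f[54] = 258.300781 -122.286865 -219.277039
DO FORCE: end f[54] = 229.446487 -248.274734 -255.144485
"""

bad = mismatches(parse_trace(REF), parse_trace(DD), {"after move_f", "end"})
for step, stage, worst in bad:
    print(f"step {step}, {stage}: max component diff {worst:.3f}")
```

On these excerpts, step 0 agrees within the tolerance at both stages, and the first reported disagreement is "after move_f" in step 1.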
> 
>     Can anyone suggest how the code paths might differ between these two
>     steps? I've debugged every step along the way that I can figure out and
>     all I can come up with is that the forces end up different. I know that
>     may be a big request without seeing the code, but I'm simply
>     determining non-local Drudes the same way we do with vsites, and
>     communicating their forces with the existing dd_move_f_specat function
>     that vsites also use.
> 
>     Any help would be greatly appreciated. I've been stuck on this forever
>     and it is clear that our user community really wants this feature. I
>     can give them OpenMP easily, but that's rather restrictive...
> 
>     -Justin
> 
>     -- 
>     ==================================================
> 
>     Justin A. Lemkul, Ph.D.
>     Assistant Professor
>     Office: 301 Fralin Hall
>     Lab: 303 Engel Hall
> 
>     Virginia Tech Department of Biochemistry
>     340 West Campus Dr.
>     Blacksburg, VA 24061
> 
>     jalemkul at vt.edu | (540) 231-3129
>     http://www.thelemkullab.com
> 
>     ==================================================
> 


-- 
David van der Spoel, Ph.D.,
Professor of Biology
Uppsala University.
http://virtualchemistry.org

