[gmx-developers] Hacking domain decomposition for long range bonded terms.

Berk Hess hess at kth.se
Tue Nov 19 07:50:32 CET 2013


Hi,

I think the hidden mdrun option -ddbondcomm should solve your issues. 
With it, the communication should always work for decompositions up to 
2x2x2. I don't know why I made this option hidden. We should probably 
make it a normal option. We could also hint at the option in the error 
message you quote.

What will also help with your issue is the Verlet scheme. I uploaded 
change https://gerrit.gromacs.org/#/c/2775/ which adds support for LJ 
shift (through vdw-modifier=Force-switch) to the Verlet scheme. This 
will make Martini run twice as fast and allows you to use OpenMP-only 
parallelization on a single node (which is like PD, but faster). On 
multiple nodes you can run MPI+OpenMP parallelization, which will allow 
you to run fewer domains and avoid the issue in many cases. Note that I 
have not implemented (and will not implement) shift functions for 
Coulomb, but I heard from Groningen that reaction-field should work OK 
with Martini.
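
As a rough sketch, the mdp settings this implies might look as follows. 
Only vdw-modifier=Force-switch is quoted from the change above; the 
other two lines are assumptions about how to select the Verlet scheme 
and reaction-field, and cut-off values are left out on purpose since 
they depend on the Martini setup in use.

```
; sketch only, untested with Martini
cutoff-scheme = Verlet          ; assumed: selects the Verlet scheme
vdw-modifier  = Force-switch    ; LJ shift, needs the gerrit change
coulombtype   = Reaction-Field  ; assumed: used instead of Coulomb shift
```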

Cheers,

Berk

On 11/18/2013 09:59 PM, XAvier Periole wrote:
>
> Dears,
>
> Following up on the issue of long-range bonded terms preventing the 
> use of DD, which we previously worked around by using PD.
>
> We have looked into the groupcoord.c solution but are currently stuck. 
> Our idea was to supplement at each step the local atom list of each 
> node with the atoms involved in those long range bonded terms (total 
> of six atoms). We have bypassed the check on the minimum DD box size 
> (15 nm) by forcing the code to take the decomposition we choose. Here 
> 2x2x2.
>
> Of course, mdrun now crashes when it gets to calculating those 
> long-range bonded interactions and complains that they are missing. 
> The error message is given below. We have trouble finding where 
> exactly the code determines which coordinates to communicate to each 
> domain.
>
> Any suggestion or hint would be greatly appreciated.
>
> XAvier and Manel.
>
>> A list of missing interactions:
> Bond of  60636 missing      1
> Improper Dih. of     12 missing      2
>
> Molecule type 'rhodimmer'
> the first 10 missing interactions, except for exclusions:
> Improper Dih. atoms  117  423  279 1066 global   117   423 279  1066
> Bond atoms  279 1066           global   279  1066
> Improper Dih. atoms  279 1066 1210  904 global   279  1066 1210   904
>
> -------------------------------------------------------
> Program mdrun, VERSION 4.6.3
> Source code file: /src/gromacs-4.6.3/src/mdlib/domdec_top.c, line: 393
>
> Fatal error:
> 3 of the 103492 bonded interactions could not be calculated because 
> some atoms involved moved further apart than the multi-body cut-off 
> distance (1.4 nm) or the two-body cut-off distance (1.4 nm), see 
> option -rdd, for pairs and tabulated bonds also see option -ddcheck
> For more information and tips for troubleshooting, please check the 
> GROMACS website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
>
