[gmx-developers] domain decomposition issue
francesco oteri
francesco.oteri at gmail.com
Wed Aug 15 16:41:07 CEST 2012
Thank you Mark,
I am trying to implement suggestion 1.
I exchange the states between the two simulations using a copy, and that
part actually works.
But at the step after the exchange is executed, I still get errors that go
away only when I set
bExchanged = TRUE.
I guess dd_partition_system modifies not only the local state but also other
variables, so I tried
running dd_partition_system again with the original states, but still
only bExchanged = TRUE
lets me run without errors.
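The "exchange into a copy" idea can be sketched outside GROMACS as follows. This is only a toy model: toy_state, toy_state_copy and toy_exchange are hypothetical names, not GROMACS types or functions, and the pointer swap merely stands in for the real inter-simulation communication. The point it shows is that the working states are never overwritten, so nothing has to be exchanged back.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a simulation state; NOT the GROMACS t_state. */
typedef struct {
    int     natoms;
    double *x;      /* natoms * 3 coordinates */
} toy_state;

/* Allocate a zeroed state for natoms atoms; returns 0 on success. */
int toy_state_init(toy_state *s, int natoms)
{
    s->natoms = natoms;
    s->x      = calloc((size_t)natoms * 3, sizeof *s->x);
    return s->x != NULL ? 0 : -1;
}

void toy_state_free(toy_state *s)
{
    free(s->x);
    s->x      = NULL;
    s->natoms = 0;
}

/* Deep-copy src into dst; both must hold the same number of atoms. */
int toy_state_copy(toy_state *dst, const toy_state *src)
{
    if (dst->natoms != src->natoms)
    {
        return -1;
    }
    memcpy(dst->x, src->x, (size_t)src->natoms * 3 * sizeof *src->x);
    return 0;
}

/* Swap the coordinate buffers of two scratch copies. This stands in
 * for the exchange between simulations X and Y. */
int toy_exchange(toy_state *a, toy_state *b)
{
    double *tmp;

    if (a->natoms != b->natoms)
    {
        return -1;
    }
    tmp  = a->x;
    a->x = b->x;
    b->x = tmp;
    return 0;
}
```

Each simulation would copy its own state into a scratch toy_state, exchange only the scratch copies, and compute on the scratch copy while continuing to integrate the untouched original.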
Here is the new code:
if (DOMAINDECOMP(cr))
{
    int old_flag = state->flags;
    state->flags      = (1<<estX); // I am interested only in coordinates
    state_copy->flags = (1<<estX);
    dd_collect_state(cr->dd,state,state_global_copy);
    state->flags = old_flag;
}
GMX_BARRIER(cr->mpi_comm_mygroup);
if (MASTER(cr))
{
    // Now state_global_copy contains the state_global of Y
    exchange_state(cr->ms, Y, state_global_copy);
}
GMX_BARRIER(cr->mpi_comm_mygroup);
if (PAR(cr))
{
    if (DOMAINDECOMP(cr))
    {
        dd_partition_system(fplog,step,cr,TRUE,1,
                            state_global_copy,top_global,ir,
                            state_copy,&f,mdatoms,top_copy,fr,
                            vsite,shellfc,constr,
                            nrnb,wcycle,FALSE);
    }
}
// DOING SOMETHING
dd_partition_system(fplog,step,cr,TRUE,1,
                    state_global,top_global,ir,
                    state,&f,mdatoms,top,fr,
                    vsite,shellfc,constr,
                    nrnb,wcycle,FALSE);
// I still get the error, which only bExchanged = TRUE avoids
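The flag handling at the top of the code above (save the mask, restrict it to coordinates, collect, restore) can be isolated in a small sketch. Everything here is hypothetical: flag_state, the TOY_EST_* indices and toy_collect are toy stand-ins, not the GROMACS t_state, est* enum or dd_collect_state. toy_collect only counts how many entries the current mask enables.

```c
#include <assert.h>

/* Hypothetical entry indices; the real GROMACS enum is different. */
enum { TOY_EST_X, TOY_EST_V, TOY_EST_BOX, TOY_EST_NR };

/* Toy state that carries only the bitmask of active entries. */
typedef struct {
    int flags;
} flag_state;

/* Stand-in for a collect routine: returns how many entries the
 * current mask would make it touch. */
int toy_collect(const flag_state *s)
{
    int i, n = 0;

    for (i = 0; i < TOY_EST_NR; i++)
    {
        if (s->flags & (1 << i))
        {
            n++;
        }
    }
    return n;
}

/* Restrict the mask, run the collect, then restore the original
 * mask so later code sees the state unchanged. */
int toy_collect_only(flag_state *s, int mask)
{
    int old_flags = s->flags;
    int n;

    s->flags = mask;
    n        = toy_collect(s);
    s->flags = old_flags;
    return n;
}
```

With all three toy entries enabled, toy_collect_only(&s, 1 << TOY_EST_X) touches a single entry and leaves s.flags as it was before the call.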
Since it is tedious for you to correct my code, is there any documentation
on the parameters used by the dd_ functions
that would let me call them with the right input?
Francesco
2012/8/15 Mark Abraham <Mark.Abraham at anu.edu.au>
> On 15/08/2012 5:46 AM, francesco oteri wrote:
>
> Dear gromacs users and developers,
> I have a question related to domain decomposition:
>
> I have to run multiple simulations and, at every step:
> 1) get the state of sim Y into sim X (and sim Y needs the state from sim X)
> 2) get the potential energy
> 3) continue
>
>
>
> Right now I am testing point 1. In particular, I exchange the state
> between simulations X and Y twice:
> the first exchange lets simulation X get the state of Y (and
> vice versa), while the second exchange
> restores the original situation.
>
>
> Seems you don't actually want to exchange states, but rather do a
> computation on a copy of the other state. I'd either
> 1) do exchange_state on X into a different t_state from the one with which
> X is simulating (so there is no need to exchange back, since doing
> dd_partition_system a second time requires neighbour searching twice and
> that will kill your scaling even harder)
> 2) get Y to do the computation on its coordinates, since swapping the
> result is probably much cheaper than collecting the state, communicating the
> state, doing NS and DD on the state and then computing on it. That might
> mean maintaining multiple t_forcerec or gmx_mtop_t, but at least those data
> structures are likely constant, so you only have to communicate them
> rarely (once?).
>
>
>
> I inserted the following code between lines
>
> if ((repl_ex_nst > 0) && (step > 0) && !bLastStep &&
> do_per_step(step,repl_ex_nst))
> {
>
> and
>
> bExchanged = replica_exchange(fplog, cr, repl_ex, state_global,
> enerd->term, state,step,t);
>
>
> //Performing the first exchange
> if (DOMAINDECOMP(cr))
> {
>     dd_collect_state(cr->dd,state,state_global);
> }
>
> if (MASTER(cr))
> {
>     exchange_state(cr->ms, Y, state_global);
> }
>
> if (DOMAINDECOMP(cr))
> {
>     dd_partition_system(fplog,step,cr,TRUE,1,
>                         state_global,top_global,ir,
>                         state,NULL,mdatoms,top,fr,
>                         vsite,shellfc,constr,
>                         nrnb,wcycle,FALSE);
> }
>
> //Now every node should have its part of the Y simulation
> //Getting potential energy
>
>
>
> //Performing the second exchange
> if (MASTER(cr))
> {
>     // I don't need to call dd_collect_state because nothing changed
>     // state_global
>     exchange_state(cr->ms, Y, state_global);
> }
>
> if (DOMAINDECOMP(cr))
> {
>     dd_partition_system(fplog,step,cr,TRUE,1,
>                         state_global,top_global,ir,
>                         state,NULL,mdatoms,top,fr,
>                         vsite,shellfc,constr,
>                         nrnb,wcycle,FALSE);
> }
>
> //Now state Y is back to simulation Y
>
>
> The problem is that this simple code causes trouble; in particular, it
> gives LINCS problems
> in do_force at the step after my code is executed.
>
> Since forcing bNS=TRUE solves the problem, I guess there is some issue
> with neighbor list updating,
> but I don't understand why.
>
> I observed that, after the last dd_partition_system, syste->natoms
> had a different value from the one
> it had before my code was executed.
>
> What is my error?
>
>
> Particularly with dynamic load balancing, there is no reason that the DD
> for any replica should resemble the DD for any other replica. Each
> processor can have totally different atoms, and a different number of
> atoms, so blindly copying stuff into those data structures will lead to the
> kinds of problems you see. I'd still expect problems even if you disable
> dynamic load balancing. Hence my suggestions above.
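
The point quoted above can be illustrated with a toy one-dimensional decomposition. This is a sketch under simplifying assumptions (equal-width slabs along one axis, no load balancing, no periodic wrapping), not the GROMACS decomposition; toy_domain_of and toy_count_per_domain are hypothetical helpers. Even with identical, fixed slab boundaries, two replicas with different coordinates put different numbers of atoms on each rank.

```c
#include <stddef.h>

/* Map a coordinate x in [0, box) to one of ndomains equal-width
 * slabs; values outside the box are clamped to the edge slabs. */
int toy_domain_of(double x, double box, int ndomains)
{
    int d = (int)(x / box * ndomains);

    if (d < 0)
    {
        d = 0;
    }
    if (d >= ndomains)
    {
        d = ndomains - 1;
    }
    return d;
}

/* Fill counts[0..ndomains-1] with the number of atoms per slab. */
void toy_count_per_domain(const double *x, size_t natoms,
                          double box, int ndomains, int *counts)
{
    size_t i;
    int    d;

    for (d = 0; d < ndomains; d++)
    {
        counts[d] = 0;
    }
    for (i = 0; i < natoms; i++)
    {
        counts[toy_domain_of(x[i], box, ndomains)]++;
    }
}
```

For a box of length 1.0 split into two slabs, coordinates {0.1, 0.2, 0.3, 0.9} give counts {3, 1}, while {0.1, 0.6, 0.7, 0.8} give {1, 3}. Local arrays sized and indexed for one replica therefore do not fit the other replica's coordinates.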
>
> The implementation of replica exchange in GROMACS scales poorly because
> exchanging coordinates requires subsequent NS and DD. So I'd encourage you
> to avoid that route if you can. Exchanging Hamiltonians is much cheaper. I
> have an implementation of that for T-REMD, but it won't see the light of
> day any time soon.
>
> Mark
>
> --
> gmx-developers mailing list
> gmx-developers at gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-developers
> Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-developers-request at gromacs.org.
>
--
Kind regards, Dr. Oteri Francesco