[gmx-users] Simulation time losses with REMD
Mark Abraham
Mark.Abraham at anu.edu.au
Sat Jan 29 09:23:30 CET 2011
On 28/01/2011 4:46 PM, Mark Abraham wrote:
> Hi,
>
> I compared the .log file time accounting for the same .tpr file run
> alone in serial or as part of an REMD simulation (with each replica on
> a single processor). It ran about 5-10% slower in the latter. The
> effect was a bit larger when comparing the same .tpr on 8 processors
> against REMD with 8 processors per replica. The effect seems fairly
> independent of whether I compare the lowest or highest replica.
OK, I found the issue by binary-searching the code for the offending
line. It's in compute_globals() in src/kernel/md.c: the call to
gmx_sum_sim consumes all the extra time. This code takes care of
synchronizing signals between the simulations, for possibly doing
checkpointing.
    if (MULTISIM(cr) && bInterSimGS)
    {
        if (MASTER(cr))
        {
            /* Communicate the signals between the simulations */
            gmx_sum_sim(eglsNR, gs_buf, cr->ms);
        }
        /* Communicate the signals from the master to the others */
        gmx_bcast(eglsNR*sizeof(gs_buf[0]), gs_buf, cr);
    }
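The key point is that gmx_sum_sim is a collective over the replica
masters, so it acts as an implicit barrier: every replica stalls until
the slowest one reaches the signalling step. A minimal standalone
sketch (not GROMACS code; compile with mpicc and run with, say,
mpirun -np 8) that demonstrates the effect:

/* Each rank does an unequal amount of "MD work" per step, then all
 * ranks meet in an Allreduce. Every rank ends up reporting roughly
 * the wall time of the slowest rank. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int   rank, nranks, step;
    float signal = 0.0f;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double t0 = MPI_Wtime();
    for (step = 0; step < 100; step++)
    {
        /* unequal work: rank i takes (i+1) ms per step */
        usleep((rank + 1)*1000);
        /* the collective: nobody proceeds until everyone arrives */
        MPI_Allreduce(MPI_IN_PLACE, &signal, 1, MPI_FLOAT, MPI_SUM,
                      MPI_COMM_WORLD);
    }
    printf("rank %d of %d: %.3f s\n", rank, nranks, MPI_Wtime() - t0);
    MPI_Finalize();
    return 0;
}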
This eventually calls
void gmx_sumf_comm(int nr, float r[], MPI_Comm mpi_comm)
{
#if defined(MPI_IN_PLACE_EXISTS) || defined(GMX_THREADS)
    MPI_Allreduce(MPI_IN_PLACE, r, nr, MPI_FLOAT, MPI_SUM, mpi_comm);
#else
    /* this function is only used in code that is not performance
       critical (during setup, when comm_rec is not the appropriate
       communication structure), so this isn't as bad as it looks. */
    float *buf;
    int    i;

    snew(buf, nr);
    MPI_Allreduce(r, buf, nr, MPI_FLOAT, MPI_SUM, mpi_comm);
    for (i = 0; i < nr; i++)
    {
        r[i] = buf[i];
    }
    sfree(buf);
#endif
}
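As an aside, on builds that take the fallback branch, every call also
pays a heap allocation and a copy. A hypothetical variant (not in the
tree, error handling omitted) that caches the scratch buffer would
remove that part of the cost, though not the synchronization, which I
suspect is what dominates here:

#include <stdlib.h>
#include <mpi.h>

static float *scratch    = NULL;
static int    scratch_nr = 0;

void gmx_sumf_comm_cached(int nr, float r[], MPI_Comm mpi_comm)
{
    int i;

    if (nr > scratch_nr)
    {
        /* grow the scratch buffer once, reuse it thereafter */
        scratch    = realloc(scratch, nr*sizeof(*scratch));
        scratch_nr = nr;
    }
    MPI_Allreduce(r, scratch, nr, MPI_FLOAT, MPI_SUM, mpi_comm);
    for (i = 0; i < nr; i++)
    {
        r[i] = scratch[i];
    }
}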
Clearly that comment is out of date. With my settings of nstlist=5,
repl_ex_nst=2500 and nstcalcenergy=-1, gs.nstms gets set to 5, and so
bInterSimGS is TRUE every 5 steps. I'm not sure whether the problem
lies with nstlist, or with the multi-simulation checkpointing
engineering, or something else.
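For scale, a quick count with my numbers (a sketch; whether all of
these collectives are really needed between exchange attempts is
exactly my question):

#include <stdio.h>

int main(void)
{
    int nstms       = 5;    /* inter-sim signalling interval (steps) */
    int repl_ex_nst = 2500; /* replica-exchange attempt interval     */

    /* 2500/5 = 500 inter-simulation collectives per exchange
     * attempt, each an implicit barrier across all replicas */
    printf("inter-sim collectives per exchange attempt: %d\n",
           repl_ex_nst/nstms);
    return 0;
}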
Mark