[gmx-developers] Re: gmx-developers Digest, Vol 87, Issue 3

Pedro Gonnet gonnet at maths.ox.ac.uk
Mon Jul 11 22:34:07 CEST 2011


Hi Berk,

Thanks for the reply!

I still don't really understand what's going on though... My problem is
the following: on a single CPU, the nsgrid_core function requires
roughly 40% more time than on two CPUs.

Using a profiler, I tracked down this difference to the condition

         /* Check if all j's are out of range so we
          * can skip the whole cell.
          * Should save some time, especially with DD.
          */
         if (nrj == 0 ||
             (grida[cgj0] >= max_jcg &&
              (grida[cgj0] >= jcg1 || grida[cgj0+nrj-1] < jcg0)))
         {
             continue;
         }

being triggered substantially more often in the two-CPU case than in the
single-CPU case. In my understanding, both cases (one or two CPUs) have
to inspect the same number of cell pairs, and hence roughly the same
computational cost should be incurred.

How, then, do the single-CPU and two-CPU cases differ? In the
single-CPU case, are the particles in cells i and j traversed twice,
i.e. once as (i,j) and once as (j,i)?

Many thanks,
Pedro


On Mon, 2011-07-11 at 12:13 +0200, gmx-developers-request at gromacs.org
wrote:
> Date: Mon, 11 Jul 2011 11:26:00 +0200
> From: Berk Hess <hess at cbr.su.se>
> Subject: Re: [gmx-developers] Re: Fairly detailed question regarding
> 	cell	lists in Gromacs in general and nsgrid_core specifically
> To: Discussion list for GROMACS development
> 	<gmx-developers at gromacs.org>
> Message-ID: <4E1AC1A8.1050402 at cbr.su.se>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> 
> Hi,
> 
> This code is for parallel neighbor searching.
> We have to ensure that pairs are not assigned to multiple processes.
> In addition, with particle decomposition we want to ensure load balancing.
> With particle decomposition, jcg0=icg and jcg1=icg+0.5*#icg; this ensures
> both of the conditions above.
> For domain decomposition we use the eighth-shell method, which uses
> up to 8 zones. Only half of the 8x8 zone pairs should interact.
> For domain decomposition, jcg0 and jcg1 are set such that only the wanted
> zone pairs interact (zones are ordered such that only consecutive j-zones
> interact, so a simple check suffices).
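> 
> Schematically, the check you quoted is just the negation of "this cell
> is needed", which would read something like the sketch below (not the
> actual code; it relies on grid->a being sorted within each cell, so
> that the first and last entries bound the cell's index range):
> 
>     /* A cell containing any j index below max_jcg is always needed.
>      * Otherwise it is only needed if its index range
>      * [grida[cgj0], grida[cgj0+nrj-1]] overlaps the window
>      * [jcg0, jcg1) assigned to this process.
>      */
>     if (nrj > 0 &&
>         (grida[cgj0] < max_jcg ||
>          (grida[cgj0] < jcg1 && grida[cgj0+nrj-1] >= jcg0)))
>     {
>         /* search this j cell against icg */
>     }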
> 
> Berk
> 
> On 07/06/2011 10:52 AM, Pedro Gonnet wrote:
> > Hello again,
> >
> > I had another long look at the code and at the older Gromacs papers and
> > realized that the main loop over charge groups starts on line 2058 of
> > ns.c and that the loops in lines 2135, 2151 and 2173 are for the
> > periodic images.
> >
> > I still, however, have no idea what the second condition in lines
> > 2232--2241 of ns.c means:
> >
> >          /* Check if all j's are out of range so we
> >           * can skip the whole cell.
> >           * Should save some time, especially with DD.
> >           */
> >          if (nrj == 0 ||
> >              (grida[cgj0] >= max_jcg &&
> >               (grida[cgj0] >= jcg1 || grida[cgj0+nrj-1] < jcg0)))
> >          {
> >              continue;
> >          }
> >
> > Does anybody know what max_jcg, jcg1 and jcg0 are? Or does anybody know
> > where this is documented in detail?
> >
> > Cheers, Pedro
> >
> >
> > On Tue, 2011-07-05 at 16:07 +0100, Pedro Gonnet wrote:
> >> Hi,
> >>
> >> I'm trying to understand how Gromacs builds its neighbor lists and have
> >> been looking, more specifically, at the function nsgrid_core in ns.c.
> >>
> >> If I understand the underlying data organization correctly, the grid
> >> (t_grid) contains an array of cells in which the indices of charge
> >> groups are stored. Pairs of such charge groups are identified and stored
> >> in the neighbor list (put_in_list).
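> >>
> >> Concretely, I picture grid->a as one flat array of charge-group
> >> indices ordered cell by cell, with a per-cell offset and count; the
> >> field names in this sketch are my guesses from reading the code:
> >>
> >>     cgi0 = grid->index[ci]; /* offset of cell ci's entries in grid->a */
> >>     nri  = grid->nra[ci];   /* number of charge groups in cell ci */
> >>     /* the charge groups of ci are then
> >>      * grid->a[cgi0], ..., grid->a[cgi0+nri-1] */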
> >>
> >> What I don't really understand is how these pairs are identified.
> >> Usually one would loop over all cells, loop over each charge group
> >> therein, loop over all neighboring cells and store the charge groups
> >> therein which are within the cutoff distance.
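> >>
> >> In pseudo-C, the textbook scheme I have in mind looks roughly like
> >> this (all names here are made up, just to fix the idea; the j > i
> >> test keeps each pair from being stored twice):
> >>
> >>     for (ci = 0; ci < ncells; ci++)              /* all cells */
> >>         for (ii = 0; ii < ncg[ci]; ii++)         /* i groups in ci */
> >>             for (n = 0; n < nneigh; n++)         /* stencil of cells */
> >>             {
> >>                 cj = neigh[ci][n];               /* neighboring cell */
> >>                 for (jj = 0; jj < ncg[cj]; jj++) /* j groups in cj */
> >>                 {
> >>                     i = cga[ci][ii];
> >>                     j = cga[cj][jj];
> >>                     if (j > i && dist2(x[i], x[j]) < rcut2)
> >>                         add_pair(nlist, i, j);   /* within cutoff */
> >>                 }
> >>             }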
> >>
> >> I assume that the first loop, over all cells, is somehow realized by
> >> the for-loops starting at lines 2135, 2151 and 2173 of ns.c. However, I
> >> don't really understand how this is done: what exactly do these loops
> >> iterate over?
> >>
> >> In any case, the coordinates of the particle in the outer loop seem to
> >> land in the variables XI, YI and ZI. The inner loop (for-loops starting
> >> in lines 2213, 2216 and 2221 of ns.c) then runs through the neighboring
> >> cells. If I understand correctly, cj is the index of the neighboring
> >> cell, nrj is the number of charge groups in that cell, and cgj0 is the
> >> offset of that cell's charge groups in the data.
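> >>
> >> If that reading is right, the inner loop would visit the j charge
> >> groups of cell cj along these lines (again only a sketch):
> >>
> >>     for (jj = 0; jj < nrj; jj++)
> >>     {
> >>         jcg = grida[cgj0 + jj]; /* index of one j charge group */
> >>         /* ... distance checks and put_in_list ... */
> >>     }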
> >>
> >> What I don't really understand here are the lines 2232--2241:
> >>
> >>          /* Check if all j's are out of range so we
> >>           * can skip the whole cell.
> >>           * Should save some time, especially with DD.
> >>           */
> >>          if (nrj == 0 ||
> >>              (grida[cgj0] >= max_jcg &&
> >>               (grida[cgj0] >= jcg1 || grida[cgj0+nrj-1] < jcg0)))
> >>          {
> >>              continue;
> >>          }
> >>
> >> Apparently, some cells can be excluded, but what are the exact criteria?
> >> The test on nrj is somewhat obvious, but what is stored in grid->a?
> >>
> >> There is probably no short answer to my questions, but if anybody could
> >> at least point me to any documentation or description of how the
> >> neighbors are collected in this routine, I would be extremely thankful!
> >>
> >> Cheers, Pedro
> >>
> >>
> >
> 