[gmx-developers] nbnxn generation

Mark Abraham mark.j.abraham at gmail.com
Fri Aug 22 14:07:06 CEST 2014

On Fri, Aug 22, 2014 at 6:39 AM, Roland Schulz <roland at utk.edu> wrote:

> Hi,
> I vaguely remember that someone said there are some plans for generating
> the Verlet kernels in some new way. Do I remember that correctly, or do no
> such plans exist? If they do, what are they?

Erik has some plans - the general idea is that we use a Python script to
generate flat source files with no conditionality of any sort, along
similar lines to the current group scheme kernel generator.
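
To make "flat, no conditionality" concrete, here is a rough sketch of what
the generated output for two variants might look like. The function names,
argument lists and the plain Coulomb + LJ math are made up for illustration;
they are not taken from Erik's plans or from any existing GROMACS kernel.

```cpp
#include <cmath>

// Hypothetical generated variant 1: Coulomb + LJ, energy and force.
// Straight-line code: no #if, no template flags, no runtime switches.
float kernel_coul_lj_energy_force(int n, const float* r2, const float* qq,
                                  const float* c6, const float* c12,
                                  float* fscal)
{
    float energy = 0.0f;
    for (int i = 0; i < n; ++i)
    {
        const float rinv  = 1.0f / std::sqrt(r2[i]);
        const float rinv2 = rinv * rinv;
        const float rinv6 = rinv2 * rinv2 * rinv2;
        const float vcoul = qq[i] * rinv;          // qq / r
        const float vrep  = c12[i] * rinv6 * rinv6; // c12 / r^12
        const float vdisp = c6[i] * rinv6;          // c6 / r^6
        energy += vcoul + vrep - vdisp;
        // Scalar force divided by r, ready to multiply by the distance vector.
        fscal[i] = (vcoul + 12.0f * vrep - 6.0f * vdisp) * rinv2;
    }
    return energy;
}

// Hypothetical generated variant 2: Coulomb only, force only.
// The generator would emit this as a separate function with the LJ and
// energy lines simply absent, rather than guarded by a condition.
void kernel_coul_force(int n, const float* r2, const float* qq, float* fscal)
{
    for (int i = 0; i < n; ++i)
    {
        const float rinv = 1.0f / std::sqrt(r2[i]);
        fscal[i] = qq[i] * rinv * rinv * rinv;      // qq / r^3
    }
}
```

Each variant can be read, compiled and stepped through on its own, at the
cost of the generated files repeating a lot of near-identical code.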

> Erik, you mentioned previously that there are potential problems with
> intrinsics and C++. Do we already have any examples of that?
> If C++ and intrinsics don't cause any problems, one option to replace the
> preprocessor would be using templated functions (for an example see here:
> http://stackoverflow.com/questions/6179295/if-statement-inside-a-cuda-kernel/6179580#6179580
> - this is for CUDA but doesn't affect the idea).

Yes, that's another way we could metaprogram. I suggested Christian try it
out for combining the FFT grids for better LJPME performance (see draft at
https://gerrit.gromacs.org/#/c/3266/11 in fft5p.cpp for those with access).
Since constant propagation and dead-code elimination probably work fine
within a function, even a templated one, I think that's an approach we
could use in places. It's definitely better than duplication, and often
more readable than preprocessor-based techniques.
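
A minimal sketch of the templated-function idea, assuming a toy pair
interaction: the function name, the two bool flags and the Coulomb + LJ math
are hypothetical and are not taken from fft5p.cpp or from any GROMACS kernel.
Because the flags are compile-time constants, each instantiation should end
up with only the branches it needs once the compiler has removed the dead
code.

```cpp
#include <cmath>

// Hypothetical pair kernel. calcEnergy and useLJ are template parameters,
// so the "if" tests below are on compile-time constants and the untaken
// branches can be dropped from each instantiation.
template <bool calcEnergy, bool useLJ>
float pairForce(float r2, float qq, float c6, float c12, float* energy)
{
    const float rinv  = 1.0f / std::sqrt(r2);
    const float rinv2 = rinv * rinv;

    // Coulomb part: scalar force divided by r is qq / r^3.
    float fscal = qq * rinv * rinv2;
    if (calcEnergy)
    {
        *energy += qq * rinv;
    }

    if (useLJ)
    {
        const float rinv6 = rinv2 * rinv2 * rinv2;
        const float vrep  = c12 * rinv6 * rinv6;   // c12 / r^12
        const float vdisp = c6 * rinv6;            // c6 / r^6
        fscal += (12.0f * vrep - 6.0f * vdisp) * rinv2;
        if (calcEnergy)
        {
            *energy += vrep - vdisp;
        }
    }
    return fscal;
}

// Example instantiations a caller might select between at run time:
//   pairForce<true, true>(r2, qq, c6, c12, &e);      // energy + force, with LJ
//   pairForce<false, false>(r2, qq, c6, c12, &e);    // force only, Coulomb only
```

The single definition replaces what would otherwise be several
preprocessor-selected copies, but every code path is still visible in the
one source function.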

The downside of template metaprogramming for non-bonded kernels is that
the source you read is still a single kernel with every possible code path
in it, whereas with a generator script doing the metaprogramming, you can
read whichever generated kernel suits your current purpose. Being able to
see the metaprogramming output can be useful when developing new kernels.
I imagine compile time of ~100 kernels might be a little faster overall
with the generator script, and debuggers might have a better time, too.

The downside of generation is having lots of generated code. The tarball
should ship with the code already generated, but for the repo we should
perhaps generate it at configure time.

On Anton, they metaprogram even harder - the executables for NPT and NVT
are different, for example.


> Roland
> --
> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
> 865-241-1537, ORNL PO BOX 2008 MS6309