[gmx-developers] slightly odd-looking code

Mark Abraham mark.j.abraham at gmail.com
Tue Mar 24 19:49:59 CET 2020


Hi gmx developers!


On Tue, 24 Mar 2020 at 15:29, jan <rtm443x at googlemail.com> wrote:

> Hi all,
> given what's said below I need to be clear about where I am.
> I'm a back-end dev specialising somewhat in SQL/RDBMSs + data
> management (but not data analysis of any note), with plenty of
> experience of other languages etc. however I have never done any
> x86/x64 as such, I normally use high-level languages, I have no
> experience at all with GPUs and (as already mentioned) my C/C++ is
> neolithic, and I've no experience with the modern c/c++ build tools.
> My linux is not great.  And the nearest I can get to your kind of
> mathematics is a background in classical logic ie. not very close at
> all.
>
> None of these bother me; all are fixable, but it will take time. I'll
> need a minimal amount of guidance to get going - "read this" is fine,
> but I won't need hand-holding, and you don't have time for that anyway.
>

Great - you're coming at the questions from a totally different
perspective, which is healthy for everyone, but that's going to give you a
steep learning curve. There are some useful recorded webinars from BioExcel,
given particularly by Paul, Szilard, Carsten, and me over recent years, that
are a good starting point for understanding how the code operates at run
time, but you should look elsewhere for an "intro to molecular dynamics for
non-scientists." There's a bunch of material online - has anybody got
suggestions?

> First things first, I need to compile this stuff up.
> I'm using windows + cygwin - is this the best environment or do you
> recommend working entirely in Linux?
>

We test in CI on Windows + MSVC, so that works fine, but there has been no
love for cygwin for about a decade. It's almost certainly fixable, but it
has never been a priority. So go native one way or the other :-)

> I've followed the instructions to get it working on cygwin and got
> stuck. Compilation fails with
>
> C:\Program Files\Python38\python.exe: can't open file
>
> '/cygdrive/c/Users/jan/Desktop/gromacs/gromacs-2020.1/admin/createFileHash.py':
> [Errno 2] No such file or directory
> CMake Error at cmake/gmxGenerateVersionInfoRelease.cmake:115 (file):
>   file failed to open for reading (No such file or directory):
>     /cygdrive/c/Users/jan/Desktop/gromacs/gromacs-2020.1/computed_checksum
>
> computed_checksum indeed doesn't exist. What now?
>

Sigh, that's a broken implementation of a new feature that I never thought
was worth its cost. I don't know how to fix it.

> The above is going by
> <http://manual.gromacs.org/documentation/current/install-guide/index.html>
>
> The other problem is that untarring the tarball works, but on Windows
> it's natural to use something like 7-Zip instead, and that doesn't work.
> I lost a couple of hours to that (certainly not the fault of the
> instructions, but FYI anyway).
>
> Comments below too
>
>
> On 24/03/2020, Mark Abraham <mark.j.abraham at gmail.com> wrote:
> > Hi,
> >
> > Jan, the biggest bang-for-buck optimizations relevant to Folding@Home
> > are to
> >
> > a) offer to build them an OpenCL-enabled GROMACS "core" (i.e. the
> > version of GROMACS that they run, when they run GROMACS). Currently they
> > seem to run all GPU jobs with OpenCL and OpenMM, which is nice but
> > leaves a lot of throughput on the table. The GROMACS OpenCL port is
> > mature and stable, runs on AMD/NVIDIA/Intel current GPUs, and should
> > present no more driver/user problems than their OpenMM one. Their
> > concept of a GPU slot is a single GPU accompanied by a single CPU
> > thread, whereas the GROMACS OpenCL port would prefer multiple dedicated
> > cores. That's still better than leaving GPUs empty if there aren't
> > enough OpenMM jobs in the queue, though the actual performance will be
> > woeful compared to GROMACS when you give it a healthy chunk of CPU cores
> > as well. It could even be better than OpenMM's GPU core, depending on
> > how modern that one is ;-) The GROMACS CUDA port is better still (and in
> > 2020 can do a decent job even with only a single CPU core), but they
> > have made a philosophical choice to use OpenCL only.
>
> That has to come later when I get up to speed, but carefully noted, thanks.
>
> > b) update the GROMACS CPU core in F@H, because the one used in F@H is
> > several years behind and losing the benefit of the hard optimization
> > work that we've done in the meantime.
>
> Why would F@H not do this already??
>

Limited resources and priorities for it. It's a science-driven project, and
if the people prepared to do the work want to use not-GROMACS for their
science then that is that...

> But again, noted.
>
> > c) demonstrate that they can maintainably and usefully offer more than
> > two x86 builds of that GROMACS CPU core (GROMACS has lots of SIMD
> > specialized flavours, but F@H only offers SSE4.1 and basic AVX from
> > those flavours,
>
> Yes, thought this might be the case. Definitely worth it for newer chips.
> However, please note that SIMD performance on later chips does not
> always mix well with non-SIMD code and can overall *cost* performance
> <https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/>
>

Yes thanks, most of us know ;-) Just updating to add AVX2 would give a
clear win.
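To make that concrete: the client only needs to ask the CPU what it
supports and then fetch the matching pre-built core. Below is a minimal
sketch of that kind of dispatch using the GCC/Clang x86 CPU-detection
builtins - it is not GROMACS or F@H code, and the flavour names are purely
illustrative.

#include <cstdio>

// Pick the most capable pre-built flavour this host could run.
// (Illustrative names only; a real client would map these to downloads.)
static const char* chooseSimdFlavour()
{
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
    {
        return "avx2_256";
    }
    if (__builtin_cpu_supports("avx"))
    {
        return "avx_256";
    }
    if (__builtin_cpu_supports("sse4.1"))
    {
        return "sse4.1";
    }
    return "plain-c";
}

int main()
{
    std::printf("best supported flavour: %s\n", chooseSimdFlavour());
    return 0;
}

On an AVX2-capable host this prints avx2_256; older worker machines would
fall back to the SSE4.1 build that F@H already ships.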


> > which leaves a lot of performance on the table on recent x86 CPUs. We
> > already have all the logic needed to work out which pre-built GROMACS
> > to download and run, because we use it in containerized GROMACS builds
> > also.)
> >
> > Unfortunately they've never open-sourced any of that, so finding out
> > where to start is the first challenge. But that way you'll have a lot
> > more impact sooner than you will from profiling GROMACS runs after 30
> > years of optimization. ;-)
>
> I dunno yet. Model tuning is beyond me obviously, but I've seen some
> stuff in the code that I question WRT optimality. However, if it's cold
> code or all memory-bound then I'll be fixing the wrong thing. Time to
> profile, but I need to get it to compile first.
>

Memory? What's that? :-D GROMACS memory usage is typically measured in
megabytes, with sophisticated data-parallelism to keep the working set for
each core down around cache sizes. Obviously you can scale up the problem
to get out of cache, but the problem sizes that suit interesting science
are comparable with the amount of L3 cache you get on a socket these days.
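If you want a back-of-the-envelope feel for why, here's a quick sketch with
made-up but typical numbers (not measured from any particular run):

#include <cstdio>

int main()
{
    // Assume a fairly typical system size and single-precision coordinates.
    const double numAtoms       = 100e3;
    const double bytesPerVector = 3 * sizeof(float);           // x, y, z
    // Positions + velocities + forces; ignores neighbour lists, parameters.
    const double coreArrays     = numAtoms * 3 * bytesPerVector;
    std::printf("core per-atom arrays: ~%.1f MiB\n",
                coreArrays / (1024.0 * 1024.0));                // ~3.4 MiB
    return 0;
}

Even after neighbour lists and parameters inflate that a few-fold, it's in
the same ballpark as the tens of MiB of L3 on a modern socket, and domain
decomposition splits it further per core.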

There's a big pile of code in the repo that warrants exhaustive
optimization, and a lot more that is used by only a handful of people and
generally doesn't. It's hard to make a valuable impact in either kind of
place, for different reasons.

Mark

> Happy to take this offline and reduce mailing list clutter.
>
> cheers
>
> jan
>
> >
> > Mark
> >
> > On Mon, 23 Mar 2020 at 14:59, jan <rtm443x at googlemail.com> wrote:
> >
> >> Hi,
> >> I'm a general back-end dev.  Given the situation, and Folding@home
> >> using GROMACS, I thought I'd poke through the code. I noticed
> >> something unexpected, and was advised to email it here. In edsam.cpp,
> >> this:
> >>
> >>
> >> void do_linacc(rvec* xcoll, t_edpar* edi)
> >> {
> >>     /* loop over linacc vectors */
> >>     for (int i = 0; i < edi->vecs.linacc.neig; i++)
> >>     {
> >>         /* calculate the projection */
> >>         real proj = projectx(*edi, xcoll, edi->vecs.linacc.vec[i]);
> >>
> >>
> >>         /* calculate the correction */
> >>         real preFactor = 0.0;
> >>         if (edi->vecs.linacc.stpsz[i] > 0.0)
> >>         {
> >>             if ((proj - edi->vecs.linacc.refproj[i]) < 0.0)
> >>             {
> >>                 preFactor = edi->vecs.linacc.refproj[i] - proj;
> >>             }
> >>         }
> >>         if (edi->vecs.linacc.stpsz[i] < 0.0)
> >>         {
> >>             if ((proj - edi->vecs.linacc.refproj[i]) > 0.0)
> >>             {
> >>                 preFactor = edi->vecs.linacc.refproj[i] - proj;
> >>             }
> >>         }
> >>        [...]
> >>
> >>
> >> In both cases it reaches the same code
> >>
> >>   preFactor = edi->vecs.linacc.refproj[i] - proj
> >>
> >> That surprised me a bit - is it deliberate? If so, maybe the code
> >> can be simplified anyway.
> >>
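For what it's worth, if those two branches really are meant to apply the
same correction, the simplification you're hinting at would look something
like the untested sketch below (just the loop body, reusing the identifiers
from your snippet); whether the duplication is deliberate is a separate
question.

        /* calculate the projection */
        real proj = projectx(*edi, xcoll, edi->vecs.linacc.vec[i]);

        /* calculate the correction: apply it only when the projection lies
         * on the "wrong" side of the reference for the sign of the step size */
        real       preFactor = 0.0;
        const real diff      = proj - edi->vecs.linacc.refproj[i];
        if ((edi->vecs.linacc.stpsz[i] > 0.0 && diff < 0.0)
            || (edi->vecs.linacc.stpsz[i] < 0.0 && diff > 0.0))
        {
            preFactor = -diff; /* same as refproj[i] - proj in both branches */
        }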
> >> That aside, if you're looking for performance I might be able to help.
> >> I don't know the high level stuff *at this point* and my C++ is so
> >> rusty it creaks, but I can brush that up, do profiling and whatnot.
> >> I'm pretty experienced, just not in this area.  Speeding things up is
> >> something I've got a track record of (though I usually have a good
> >> feel for the problem domain first, which I don't here)
> >>
> >> Would it be of some value for me to try getting more speed? If so,
> >> first thing I'd need is to get this running under cygwin, which I'm
> >> struggling with.
> >>
> >> regards
> >>
> >> jan
> >
>