[gmx-developers] slightly odd-looking code
Paul Bauer
paul.bauer.q at gmail.com
Tue Mar 24 19:54:09 CET 2020
Hello,
I agree with Mark that the webinars should be a good start to have an idea
about the code.
Concerning the error you are getting, this shouldn't happen if you work and
build from a git repository. But it is still something I think should be
fixed (especially because I have been the one pushing for it against Mark's
objections).
Cheers
Paul
On Tue, 24 Mar 2020, 19:50 Mark Abraham, <mark.j.abraham at gmail.com> wrote:
> Hi gmx developers!
>
>
> On Tue, 24 Mar 2020 at 15:29, jan <rtm443x at googlemail.com> wrote:
>
>> Hi all,
>> given what's said below I need to be clear about where I am.
>> I'm a back-end dev specialising somewhat in SQL/RDBMSs + data
>> management (but not data analysis of any note), with plenty of
>> experience of other languages etc. however I have never done any
>> x86/x64 as such, I normally use high-level languages, I have no
>> experience at all with GPUs and (as already mentioned) my C/C++ is
>> neolithic, and I've no experience with the modern c/c++ build tools.
>> My linux is not great. And the nearest I can get to your kind of
>> mathematics is a background in classical logic ie. not very close at
>> all.
>>
>> None of these bother me; all are fixable, but it will take time. I'll
>> need a minimal amount of guidance to get going - "read this" is fine
>> but I'll need no handholding, and you don't have time for that anyway.
>>
>
> Great - you're coming at the questions from a totally different
> perspective, which is healthy for everyone, but that's going to give you a
> steep learning curve. There's some useful recorded webinars from BioExcel
> given particularly by Paul, Szilard, Carsten, and me over recent years that
> are a good starting point for understanding how the code operates at run
> time, but you should look for something else for "intro to molecular
> dynamics for non-scientists." There's a bunch of material online - has
> anybody got suggestions?
>
>> First things first, I need to compile this stuff up.
>> I'm using windows + cygwin - is this the best environment or do you
>> recommend working entirely in Linux?
>>
>
> We test in CI on Windows + MSVC, so that works fine, but there has been no
> love on cygwin for about a decade. It's almost certainly fixable, but never
> been a priority. So go native one way or the other :-)
>
>> I've followed the instructions to get it working on cygwin and got
>> stuck. Compilation fails with
>>
>> C:\Program Files\Python38\python.exe: can't open file
>> '/cygdrive/c/Users/jan/Desktop/gromacs/gromacs-2020.1/admin/createFileHash.py':
>> [Errno 2] No such file or directory
>> CMake Error at cmake/gmxGenerateVersionInfoRelease.cmake:115 (file):
>> file failed to open for reading (No such file or directory):
>> /cygdrive/c/Users/jan/Desktop/gromacs/gromacs-2020.1/computed_checksum
>>
>> computed_checksum indeed doesn't exist. What now?
>>
>
> Sigh, that's a broken implementation of a new feature that I never thought
> was worth its cost. Don't know how to fix it.
>
>> The above is going by
>> <http://manual.gromacs.org/documentation/current/install-guide/index.html>
>>
>> Other problem is that untarring the tarball works, however on windows
>> it's natural to use something like 7-zip. This doesn't work and I lost
>> a couple of hours to that (certainly not the fault of the instructions
>> but FYI anyway).
>>
>> Comments below too
>>
>>
>> On 24/03/2020, Mark Abraham <mark.j.abraham at gmail.com> wrote:
>> > Hi,
>> >
>> > Jan, the biggest bang-for-buck optimizations relevant to Folding@Home
>> > are to
>> >
>> > a) offer to build them an OpenCL-enabled GROMACS "core" (ie the version
>> > of GROMACS that they run, when they run GROMACS). Currently they seem to
>> > run all GPU jobs with OpenCL and OpenMM, which is nice but leaves a lot
>> > of throughput on the table. The GROMACS OpenCL port is mature and stable,
>> > runs on AMD/NVIDIA/Intel current GPUs, and should present no more
>> > driver/user problems than their OpenMM one. Their concept of a GPU slot
>> > is a single GPU accompanied by a single CPU thread, whereas the GROMACS
>> > OpenCL port would prefer multiple dedicated cores. That's still better
>> > than leaving GPUs empty if there's not enough OpenMM jobs in the queue,
>> > though the actual performance will be woeful compared to GROMACS when you
>> > give it a healthy chunk of CPU cores also. Could even be better than
>> > OpenMM's GPU core, depending how modern that one is, too ;-) The GROMACS
>> > CUDA port is better still (and in 2020 can do a decent job even with only
>> > a single CPU core), but they have made a philosophical choice to use
>> > OpenCL only.
>>
>> That has to come later when I get up to speed, but carefully noted,
>> thanks.
>>
>> > b) update the GROMACS CPU core in F@H because the one used in F@H is
>> > several years behind and losing the benefit of the hard optimization
>> > work that we've done in the meantime.
>>
>> Why would f@h not do this already??
>>
>
> Limited resources and priorities for it. It's a science-driven project,
> and if the people prepared to do the work want to use not-GROMACS for their
> science then that is that...
>
>> But again, noted.
>>
>> > c) demonstrate that they can maintainably and usefully offer more than
>> > two x86 builds of that GROMACS CPU core (GROMACS has lots of SIMD
>> > specialized flavours, but F@H only offers SSE4.1 and basic AVX from
>> > those flavours,
>>
>> Yes, thought this might be the case. Definitely worth it for newer chips.
>> However, please note that SIMD performance for later chips does not
>> always mix well with non-SIMD code and can overall *cost* performance
>> <https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/>
>>
>
> Yes thanks, most of us know ;-) Just updating to add AVX2 would give a
> clear win.
>
>
>> > which leaves a lot of performance on the table on recent x86 CPUs. We
>> > already have all the logic needed to work out which pre-built GROMACS to
>> > download and run, because we use it in containerized GROMACS builds
>> > also.)
>> >
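The selection logic mentioned above is not shown in this thread, so the following is only a sketch of the general idea: probe the CPU once, then pick the most capable pre-built binary it can run. The labels mirror values of the GROMACS GMX_SIMD CMake option. The CpuFeatures struct and both function names are hypothetical, and the detection assumes the GCC/Clang __builtin_cpu_supports builtin on x86.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the SIMD features relevant to this discussion.
struct CpuFeatures
{
    bool sse4_1 = false;
    bool avx    = false;
    bool avx2   = false;
};

// Pick the most capable flavour the CPU supports. The order matters:
// newer instruction sets are checked first. Wider flavours are left out here.
std::string pickSimdFlavor(const CpuFeatures& cpu)
{
    if (cpu.avx2)
    {
        return "AVX2_256";
    }
    if (cpu.avx)
    {
        return "AVX_256";
    }
    if (cpu.sse4_1)
    {
        return "SSE4.1";
    }
    return "SSE2"; // baseline that every x86-64 CPU provides
}

// Query the running CPU. Assumption: GCC or Clang on x86; on any other
// compiler or architecture this sketch simply reports no features.
CpuFeatures detectCpuFeatures()
{
    CpuFeatures cpu;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    cpu.sse4_1 = __builtin_cpu_supports("sse4.1") != 0;
    cpu.avx    = __builtin_cpu_supports("avx") != 0;
    cpu.avx2   = __builtin_cpu_supports("avx2") != 0;
#endif
    return cpu;
}
```

A launcher could then call pickSimdFlavor(detectCpuFeatures()) and use the result to choose which build to download. Keeping the choice separate from the detection makes the choice easy to test without the matching hardware.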
>> > Unfortunately they've never open-sourced any of that, so finding out
>> > where to start is the first challenge. But that way you'll have a lot
>> > more impact sooner than you will from profiling GROMACS runs after 30
>> > years of optimization. ;-)
>>
>> I dunno yet. Model tuning is beyond me obviously, but I've seen some
>> stuff in the code that I question WRT optimality. However if it's cold
>> code or all memory bound then I'll be fixing the wrong thing. Time to
>> profile, but need to get it to compile first.
>>
>
> Memory? What's that? :-D GROMACS memory usage is typically measured in
> megabytes, with sophisticated data-parallelism to keep the working set for
> each core down around cache sizes. Obviously you can scale up the problem
> to get out of cache, but the problem sizes that suit interesting science
> are comparable with the amount of L3 cache you get on a socket these days.
>
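To put "measured in megabytes" into numbers, here is a rough back-of-envelope sketch. It assumes a hypothetical 100,000-atom system and the default mixed-precision build (where real is float), and it counts only positions, velocities, and forces. A real run also holds pair lists, topology, and communication buffers, so treat the result as a lower bound.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the GROMACS typedefs: `real` is float in the default
// mixed-precision build, and `rvec` is three reals (x, y, z).
using real = float;
using rvec = real[3];

// Bytes needed for one per-atom vector array (e.g. positions).
constexpr std::size_t bytesPerVectorArray(std::size_t numAtoms)
{
    return numAtoms * sizeof(rvec);
}

// Bytes for positions + velocities + forces; everything else is ignored.
constexpr std::size_t coreStateBytes(std::size_t numAtoms)
{
    return 3 * bytesPerVectorArray(numAtoms);
}

// Hypothetical system size, chosen only for illustration.
constexpr std::size_t exampleAtoms = 100000;
static_assert(bytesPerVectorArray(exampleAtoms) == 1200000, "1.2 MB per array");
static_assert(coreStateBytes(exampleAtoms) == 3600000, "3.6 MB for x, v and f");
```

Under these assumptions the core per-atom state is about 3.6 MB, and splitting it across cores shrinks the share each core touches further.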
> There's a big pile of code in the repo that warrants exhaustive
> optimization, and a lot that is used by only a handful of people, which
> generally doesn't. It's hard to make a valuable impact in either kind of
> place, for different reasons.
>
> Mark
>
>> Happy to take this offline and reduce mailing list clutter.
>>
>> cheers
>>
>> jan
>>
>> >
>> > Mark
>> >
>> > On Mon, 23 Mar 2020 at 14:59, jan <rtm443x at googlemail.com> wrote:
>> >
>> >> Hi,
>> >> I'm a general back-end dev. Given the situation, and folding@home
>> >> using gromacs, I thought I'd poke through the code. I noticed
>> >> something unexpected, and was advised to email it here. In edsam.cpp,
>> >> this:
>> >>
>> >>
>> >> void do_linacc(rvec* xcoll, t_edpar* edi)
>> >> {
>> >>     /* loop over linacc vectors */
>> >>     for (int i = 0; i < edi->vecs.linacc.neig; i++)
>> >>     {
>> >>         /* calculate the projection */
>> >>         real proj = projectx(*edi, xcoll, edi->vecs.linacc.vec[i]);
>> >>
>> >>         /* calculate the correction */
>> >>         real preFactor = 0.0;
>> >>         if (edi->vecs.linacc.stpsz[i] > 0.0)
>> >>         {
>> >>             if ((proj - edi->vecs.linacc.refproj[i]) < 0.0)
>> >>             {
>> >>                 preFactor = edi->vecs.linacc.refproj[i] - proj;
>> >>             }
>> >>         }
>> >>         if (edi->vecs.linacc.stpsz[i] < 0.0)
>> >>         {
>> >>             if ((proj - edi->vecs.linacc.refproj[i]) > 0.0)
>> >>             {
>> >>                 preFactor = edi->vecs.linacc.refproj[i] - proj;
>> >>             }
>> >>         }
>> >>         [...]
>> >>
>> >>
>> >> In both cases it reaches the same code
>> >>
>> >> preFactor = edi->vecs.linacc.refproj[i] - proj
>> >>
>> >> That surprised me a bit, is it deliberate? If so it may be the code
>> >> can be simplified anyway.
>> >>
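Whether or not the duplication is deliberate, the two branches can be merged without changing the result: the assignment happens exactly when the step size and (proj - refproj) have opposite signs. Below is a minimal sketch of that equivalence. It replaces the edi->vecs.linacc members with plain arguments, so the function names and signatures are hypothetical and this is not a patch against edsam.cpp.

```cpp
#include <cassert>
#include <initializer_list>

// Stand-in for the GROMACS `real` typedef (float in the default build).
using real = float;

// Control flow as in the quoted snippet, with struct members passed in
// as plain arguments (hypothetical signature, for illustration only).
real preFactorOriginal(real proj, real refproj, real stpsz)
{
    real preFactor = 0.0;
    if (stpsz > 0.0)
    {
        if ((proj - refproj) < 0.0)
        {
            preFactor = refproj - proj;
        }
    }
    if (stpsz < 0.0)
    {
        if ((proj - refproj) > 0.0)
        {
            preFactor = refproj - proj;
        }
    }
    return preFactor;
}

// Merged form: one condition, one assignment. The signs are compared
// explicitly rather than via a product, which could underflow to zero.
real preFactorSimplified(real proj, real refproj, real stpsz)
{
    const real diff          = proj - refproj;
    const bool oppositeSigns = (stpsz > 0.0 && diff < 0.0) || (stpsz < 0.0 && diff > 0.0);
    return oppositeSigns ? refproj - proj : real(0.0);
}

// Compare both forms over a small grid covering every sign combination.
void checkEquivalence()
{
    for (real stpsz : { -1.0f, 0.0f, 1.0f })
    {
        for (real proj : { -2.0f, 0.0f, 0.5f, 3.0f })
        {
            for (real refproj : { -1.0f, 0.0f, 0.5f, 2.0f })
            {
                assert(preFactorOriginal(proj, refproj, stpsz)
                       == preFactorSimplified(proj, refproj, stpsz));
            }
        }
    }
}
```

The merged form does the same work per iteration, so any gain would be in readability rather than speed.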
>> >> That aside, if you're looking for performance I might be able to help.
>> >> I don't know the high level stuff *at this point* and my C++ is so
>> >> rusty it creaks, but I can brush that up, do profiling and whatnot.
>> >> I'm pretty experienced, just not in this area. Speeding things up is
>> >> something I've got a track record of (though I usually have a good
>> >> feel for the problem domain first, which I don't here)
>> >>
>> >> Would it be of some value for me to try getting more speed? If so,
>> >> first thing I'd need is to get this running under cygwin, which I'm
>> >> struggling with.
>> >>
>> >> regards
>> >>
>> >> jan
>> >> --
>> >> Gromacs Developers mailing list
>> >>
>> >> * Please search the archive at
>> >> http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List before
>> >> posting!
>> >>
>> >> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>> >>
>> >> * For (un)subscribe requests visit
>> >> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-developers
>> >> or send a mail to gmx-developers-request at gromacs.org.
>> >>
>> >