[gmx-developers] std::vector<Rvec> slowness

Berk Hess hess at kth.se
Wed Mar 23 14:46:32 CET 2016


On 2016-03-23 14:18, Mark Abraham wrote:
> Hi,
>
> On Wed, Mar 23, 2016 at 1:56 PM Berk Hess <hess at kth.se> wrote:
>
>     Hi,
>
>     It's impractical and bad for performance to go from AoS to SoA.
>
>
> Sure, but that's what we do every MD step on typical CPUs. We do DD 
> communication on XYZXYZXYZXYZ and then transpose them to XXXXYYYYZZZZ 
> to use in short-ranged kernels. If we did DD communication on 
> XXXXX....YYYYY..... ZZZZZZ.... then we would still have gather/scatter 
> overhead, but we would not have transposition in the mix.
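>
> To make the transposition concrete, it is roughly this (an 
> illustrative sketch only, not the actual DD code; the function name 
> is made up):
>
>     // Transpose n coordinates from xyzxyzxyz... (AoS) to
>     // xxx...yyy...zzz... (SoA). The real kernels work on padded
>     // cluster blocks, but the data movement is the same idea.
>     void transposeAosToSoa(const real *xAos, real *xSoa, int n)
>     {
>         for (int i = 0; i < n; i++)
>         {
>             xSoa[0*n + i] = xAos[3*i + 0]; // all x first
>             xSoa[1*n + i] = xAos[3*i + 1]; // then all y
>             xSoa[2*n + i] = xAos[3*i + 2]; // then all z
>         }
>     }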
>
>     The question now is whether we can fix this issue or whether we
>     should always use .data() in inner loops
>
>
> I think you're missing my point... we've always done a cast from rvec 
> * to real * to pass to inner loops. :-) Using data() is a similar idea 
> - hence my suggestion of as_real_array(theVectorOfRVecs).
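>
> Something like this minimal sketch (assuming RVec stores three 
> contiguous reals; the helper name is just my suggestion):
>
>     static inline real *as_real_array(std::vector<RVec> &v)
>     {
>         return reinterpret_cast<real *>(v.data());
>     }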
No. The bonded, vsite and constraint code uses rvec, and we'd like to 
keep that in most cases so we can use rvec operations.
>
>     or if we should completely avoid std::vector<RVec> in performance
>     sensitive code and use good old C pointers with nasty manual
>     allocation.
>
>
> Allocation is irrelevant to how one uses it. One must have a real * to 
> be sure the compiler isn't confused. To get that from a vector, we use 
> data(). Whether the allocation of the memory is managed by std::vector 
> or snew is immaterial.
That's true. But we need a way of preventing coders from using [] on 
std::vector<RVec> in performance-sensitive code. Or we could consider 
not allowing that anywhere, which is at least possible to enforce (I 
just discussed this with Mark).
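
One direction would be a thin wrapper type that deliberately has no 
operator[], so callers must take the raw pointer once, outside the 
inner loop. A hypothetical, untested sketch (HotVector is a made-up 
name):

    #include <cstddef>
    #include <vector>

    // Hypothetical: a vector for hot-path data with no operator[],
    // so element access has to go through the raw pointer.
    template <typename T>
    class HotVector
    {
        public:
            void resize(std::size_t n) { data_.resize(n); }
            T       *data()       { return data_.data(); }
            const T *data() const { return data_.data(); }
            // no operator[] on purpose
        private:
            std::vector<T> data_;
    };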

Berk
>
> Mark
>
>     Cheers,
>
>     Berk
>
>
>     On 2016-03-23 13:43, Mark Abraham wrote:
>>     Hi,
>>
>>     RVec was intended as a convenience feature for people writing
>>     high-level, rather than low-level code. I would indeed not
>>     advocate using it in performance-sensitive mdrun code without
>>     checking for such performance effects.
>>
>>     I think the real issue is that the use of *rvec* generally
>>     confuses compilers, because annotations about pointers do
>>     indeed get lost, and perhaps the [3] in the underlying
>>     real (*)[3] type is not as close to being part of the type in
>>     the compiler implementation as we might expect. (For a long
>>     time, we've used very little rvec close to the short-ranged
>>     kernels. The first thing we do with the rvec * args to the
>>     SIMD bondeds is to cast them to real *. I think this is just
>>     as hard to police as Berk's concern about not using
>>     std::vector<RVec>::operator[] naively - the problem isn't the
>>     "fault" of std::vector or its use.)
>>
>>     The use of rvec * engenders some transposing that might be
>>     avoided if we had SoA rather than AoS layout in mdrun. A single
>>     rvec is likely to have all three coordinates on the same cache
>>     line, but since one typically accesses nearby atoms anyway, the
>>     caching behaviour of three separate real * arrays should be
>>     rather similar. One still tends to have gather/scatter overhead,
>>     however.
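>>
>>     Concretely, such an SoA layout could look like this (purely
>>     hypothetical; mdrun has no such struct today):
>>
>>         struct CoordinatesSoA
>>         {
>>             std::vector<real> x; // all x coordinates
>>             std::vector<real> y; // all y coordinates
>>             std::vector<real> z; // all z coordinates
>>         };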
>>
>>     Mark
>>
>>
>>     On Wed, Mar 23, 2016 at 11:24 AM Berk Hess <hess at kth.se> wrote:
>>
>>         No, using the RVec * pointer obtained through .data()
>>         already fixes the issue. We don't even need as_rvec_array().
>>
>>         The real issue is that it's near impossible to enforce this
>>         and detect [] access on vectors of RVec if we start using
>>         std::vector everywhere throughout the code.
>>
>>         Cheers,
>>
>>         Berk
>>
>>
>>         On 2016-03-23 11:19, Mark Abraham wrote:
>>>         Hi,
>>>
>>>         If so, then we might want a helper function like
>>>         as_rvec_array, but as_real_array instead.
>>>
>>>         Mark
>>>
>>>         On Wed, Mar 23, 2016 at 10:51 AM Berk Hess <hess at kth.se> wrote:
>>>
>>>             On 2016-03-23 10:50, Erik Lindahl wrote:
>>>>             Hi,
>>>>
>>>>             I haven’t followed the discussion in detail, but a long
>>>>             time ago I remember having similar issues in the
>>>>             kernels when using a list[] of rvec[] (in plain old C,
>>>>             no C++) instead of extracting the pointer and handling
>>>>             the multiply-by-3 manually. Could it be something
>>>>             similar here, e.g. that the compiler thinks it does not
>>>>             know enough about RVec, rather than something going
>>>>             wrong with the outer list?
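>>>>
>>>>             In the old plain-C kernels that looked roughly like
>>>>             this (from memory, just a sketch; list and nlist stand
>>>>             in for the neighbour-list variables):
>>>>
>>>>                 const real *xr = (const real *)x; /* x is rvec * */
>>>>                 for (int n = 0; n < nlist; n++)
>>>>                 {
>>>>                     int i = list[n];
>>>>                     /* manual multiply-by-3 indexing */
>>>>                     real xi = xr[3*i + 0];
>>>>                     real yi = xr[3*i + 1];
>>>>                     real zi = xr[3*i + 2];
>>>>                     /* ... use xi, yi, zi ... */
>>>>                 }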
>>>             That would be my guess. The index used in the same loop
>>>             comes from a vector as well and doesn't seem to affect
>>>             performance.
>>>
>>>
>>>             Berk
>>>
>>>>
>>>>             Cheers,
>>>>
>>>>             Erik
>>>>
>>>>>             On 23 Mar 2016, at 10:44, Berk Hess <hess at kth.se> wrote:
>>>>>
>>>>>             On 2016-03-23 10:42, Mark Abraham wrote:
>>>>>>             Hi,
>>>>>>
>>>>>>             On Wed, Mar 23, 2016 at 9:44 AM Berk Hess <hess at kth.se> wrote:
>>>>>>
>>>>>>                 Hi,
>>>>>>
>>>>>>                 Luckily Szilard does thorough testing and noticed
>>>>>>                 a performance degradation in change set 25 of
>>>>>>                 https://gerrit.gromacs.org/#/c/5232/. The only
>>>>>>                 significant change with respect to previous sets
>>>>>>                 is replacing C pointers by std::vector. I traced
>>>>>>                 the performance difference back to a single loop,
>>>>>>                 which must have become several factors slower to
>>>>>>                 explain the time difference. I get the performance
>>>>>>                 back when replacing the vector by a pointer
>>>>>>                 extracted with .data(), see below. I looked at the
>>>>>>                 assembly code from gcc 5.3.1 and the vector case
>>>>>>                 generated 200 extra instructions, which makes it
>>>>>>                 difficult to see what the actual difference is.
>>>>>>                 The pointer case uses a lot of vmovss and vaddss
>>>>>>                 instructions, which the vector case uses far
>>>>>>                 less, but these are only scalar SIMD instructions.
>>>>>>                 I thought that [] in vector might do bounds
>>>>>>                 checking,
>>>>>>
>>>>>>
>>>>>>             Definitely not in release builds.
>>>>>>
>>>>>>                 but it seems it does not.
>>>>>>                 Can anyone explain why the vector case can be so
>>>>>>                 slow?
>>>>>>
>>>>>>                 If this is a general issue (with RVec or more?),
>>>>>>                 we need to always extract a pointer with .data()
>>>>>>                 for use in all inner loops. This is pretty
>>>>>>                 annoying and difficult to enforce.
>>>>>>
>>>>>>                 Cheers,
>>>>>>
>>>>>>                 Berk
>>>>>>
>>>>>>                   const std::vector<RVec> f_foreign = idt_foreign->force;
>>>>>>
>>>>>>
>>>>>>             This does a copy of the vector, and doesn't seem to
>>>>>>             be in any version of this patch in gerrit. Is this
>>>>>>             what you meant to write?
>>>>>             I tried this. But my original "vectorized" patch set
>>>>>             took a pointer from idt_foreign and did not copy the
>>>>>             vector; that gives the same slow performance.
>>>>>
>>>>>             Berk
>>>>>>
>>>>>>             Mark
>>>>>>
>>>>>>                 or
>>>>>>                   const RVec *f_foreign = idt_foreign->force.data();
>>>>>>
>>>>>>                   int natom = atomList->atom.size();
>>>>>>                   for (int i = 0; i < natom; i++)
>>>>>>                   {
>>>>>>                       int ind = atomList->atom[i];
>>>>>>                       rvec_inc(f[ind], f_foreign[ind]);
>>>>>>                   }
