[gmx-developers] FLOP accounting in Gromacs 4.6

Szilárd Páll szilard.pall at cbr.su.se
Tue Sep 17 01:51:50 CEST 2013


On Mon, Sep 16, 2013 at 3:45 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
> +1 to Berk's comment.
>
> The fact that doing N-body w/ O(N^2) algorithm is the best way to hit
> peak flop/s immediately suggests this is the wrong metric.
>
> The best (portable) performance metric I've seen for an MD code is
> particle updates per second per core. That's what we use when
> analyzing LAMMPS on BGQ vs x86, etc.

Slightly off-topic, but even the "per core" metric is a bit
dangerous these days, when an AMD core, especially for floating-point
intensive code, isn't really the same as an Intel core.
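
For reference, here is a minimal sketch of the two figures discussed in
this thread: particle updates per second per core, and the achieved
fraction of theoretical peak flop/s. All function and parameter names are
hypothetical, and nothing here parses md.log; the inputs (particle count,
step count, wall time, total flop count) are assumed to be read off by
hand. The flops_per_cycle value is a hardware assumption you supply
yourself, for example 16 single-precision flops per cycle per core on
Sandy Bridge with AVX (8-wide add plus 8-wide multiply), and it has to
match the precision of the build. The numbers in the usage part are made
up for illustration.

```python
def particle_updates_per_second_per_core(n_particles, n_steps,
                                         wall_seconds, n_cores):
    """Return (particles * steps) / (wall time * cores).

    Hypothetical helper. The per-core normalisation carries the caveat
    above: cores of different vendors are not directly comparable.
    """
    if wall_seconds <= 0 or n_cores <= 0:
        raise ValueError("wall_seconds and n_cores must be positive")
    return n_particles * n_steps / (wall_seconds * n_cores)


def fraction_of_peak(total_flops, wall_seconds, n_cores,
                     clock_hz, flops_per_cycle):
    """Return the achieved flop rate divided by the theoretical peak.

    Assumes peak = cores * clock * flops per cycle per core, with
    flops_per_cycle chosen by the caller for the hardware and precision.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    peak_flops_per_second = n_cores * clock_hz * flops_per_cycle
    if peak_flops_per_second <= 0:
        raise ValueError("peak flop rate must be positive")
    achieved_flops_per_second = total_flops / wall_seconds
    return achieved_flops_per_second / peak_flops_per_second


if __name__ == "__main__":
    # Illustrative, made-up run: 100k particles, 50k steps, 250 s, 16 cores.
    updates = particle_updates_per_second_per_core(100_000, 50_000, 250.0, 16)
    print(f"particle updates/s/core: {updates:.3e}")

    # Illustrative, made-up flop total on 16 cores at 2.5 GHz, 16 flops/cycle.
    frac = fraction_of_peak(4.8e13, 250.0, 16, 2.5e9, 16)
    print(f"fraction of peak: {frac:.1%}")
```

The first number only needs quantities that every MD code can report,
which is why it ports across machines more easily than a flop rate; the
second depends entirely on the peak you assume.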

--
Szilárd

>
> There are essentially no flops in comm btw (reductions are the
> exception); there one often uses just total time as the figure of
> merit.
> Jeff
>
> On Sep 16, 2013, at 3:38 AM, Berk Hess <hess at kth.se> wrote:
>
>> Hi,
>>
>> Every time we get such a question, our return question is: why are you asking?
>> Any application should only care about application performance and not about any other measure, such as flops.
>>
>> The flop rate will depend very much on the algorithm, as well as on the hardware.
>> On Sandy/Ivy Bridge, the Verlet PME kernels reach around 50% of peak, even more with GMX_NBNXN_SIMD_4XN set:
>> http://www.sciencedirect.com/science/article/pii/S0010465513001975
>> The FFTs probably also get around 50% of peak. But all other code is far less flop intensive. The total I get is around 30% of peak.
>> But you can crank this up by shifting work from PME mesh to pair interactions, using GMX_NBNXN_SIMD_4XN, etc.
>> RF will get lower flop rates, but higher ns/day, etc.
>>
>> So flop numbers are meaningless for most purposes.
>> I think there are only two useful cases: analyzing algorithm performance (but combined with other measures) and convincing people when they can't be convinced that flops are a useless measure. In the latter case we should make sure to maximize the flops by optimizing the settings for that purpose.
>> But I think the flop count is still reasonably accurate (±10%). Flops in communication should be negligible.
>>
>> Cheers,
>>
>> Berk
>>
>> On 09/16/2013 10:10 AM, Carsten Kutzner wrote:
>>> Hi,
>>>
>>> can I use the Mega-Flops accounting at the end of the md.log file to
>>> calculate how much of the theoretical peak performance of a processor
>>> Gromacs is using? I understand that Flops used in communication are
>>> not counted, so the accounting will give me a lower estimate.
>>>
>>> At what percentage of the theoretical peak performance will Gromacs 4.6
>>> typically run using the Verlet kernels and PME (let's say we have a
>>> big MD system)?
>>>
>>> Do I have to divide the reported Flops by two when running single precision?
>>>
>>> Thanks,
>>>   Carsten
>>
>> --
>> gmx-developers mailing list
>> gmx-developers at gromacs.org
>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>> Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-developers-request at gromacs.org.


