[gmx-developers] FLOP accounting in Gromacs 4.6
szilard.pall at cbr.su.se
Tue Sep 17 01:51:50 CEST 2013
On Mon, Sep 16, 2013 at 3:45 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
> +1 to Berk's comment.
> The fact that doing N-body w/ O(N^2) algorithm is the best way to hit
> peak flop/s immediately suggests this is the wrong metric.
> The best (portable) performance metric I've seen for an MD code is
> particle updates per second per core. That's what we use when
> analyzing LAMMPS on BGQ vs x86, etc.
Slightly off-topic, but even the "per core" metric is a bit
dangerous these days when an AMD core, especially for a floating-point
intensive code, isn't really the same as an Intel core.
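
As a rough illustration of the metric Jeff describes, here is a minimal Python sketch. The function name, its parameters and the sample numbers are all made up for the example; nothing here is taken from LAMMPS or Gromacs. Note that the "per core" normalization carries the caveat above: it only compares like with like when the cores are comparable.

```python
def particle_updates_per_second_per_core(n_particles, n_steps,
                                         wall_seconds, n_cores):
    """Return particle updates per second per core.

    One "particle update" is one particle advanced by one MD step.
    All arguments are supplied by the caller (hypothetical helper).
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    total_updates = n_particles * n_steps
    return total_updates / (wall_seconds * n_cores)


# Invented sample run: 100k particles, 50k steps, 250 s on 16 cores.
rate = particle_updates_per_second_per_core(100_000, 50_000, 250.0, 16)
print(f"{rate:.3e} particle updates / s / core")
```

Unlike a flop rate, this number does not go up when you pick a more flop-hungry algorithm that produces the same trajectory more slowly.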
> There are essentially no flops in comm btw (reductions are the
> exception); there one often uses just total time as the figure of merit.
> On Sep 16, 2013, at 3:38 AM, Berk Hess <hess at kth.se> wrote:
>> Every time we get such a question, our return question is: why are you asking?
>> Any application should only care about application performance and not about any other measure, such as flops.
>> The flop rate will depend very much on the algorithm, as well as on the hardware.
>> On Sandy/Ivy Bridge, the Verlet PME kernels reach around 50% of peak, even more with GMX_NBNXN_SIMD_4XN set:
>> The FFTs probably also get around 50% of peak. But all other code is far less flop intensive. The total I get is around 30% of peak.
>> But you can crank this up by shifting work from PME mesh to pair interactions, using GMX_NBNXN_SIMD_4XN, etc.
>> RF will get lower flop rates, but higher ns/day, etc.
>> So flop numbers are meaningless for most purposes.
>> I think there are only two useful cases: analyzing algorithm performance (but combined with other measures) and convincing people when they can't be convinced that flops are a useless measure. In the latter case we should make sure to maximize the flops by optimizing the settings for that purpose.
>> But I think the flop count is still reasonably accurate (±10%). Flops in communication should be negligible.
>> On 09/16/2013 10:10 AM, Carsten Kutzner wrote:
>>> can I use the Mega-Flops accounting at the end of the md.log file to
>>> calculate how much of the theoretical peak performance of a processor
>>> Gromacs is using? I understand that Flops used in communication are
>>> not counted, so the accounting will give me a lower estimate.
>>> At what percentage of the theoretical peak performance will Gromacs 4.6
>>> typically run using the Verlet kernels and PME (let's say we have a
>>> big MD system)?
>>> Do I have to divide the reported Flops by two when running single precision?
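
The arithmetic behind Carsten's question can be sketched as follows, under several assumptions. The function names are hypothetical. The total Mflop count is assumed to be read by hand from the accounting table at the end of md.log. The 16 flops per cycle per core is an assumed figure for single-precision AVX on Sandy/Ivy Bridge (one 8-wide add plus one 8-wide multiply per cycle). The sample numbers are invented. Whether the reported count needs rescaling for single precision is not settled in this thread, so the sketch takes the count as given.

```python
def peak_gflops(n_cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x flops per cycle."""
    return n_cores * clock_ghz * flops_per_cycle


def fraction_of_peak(total_mflop, wall_seconds, peak):
    """Achieved flop rate as a fraction of the theoretical peak.

    total_mflop  : total flop count in Mflop (assumed taken from md.log)
    wall_seconds : wall-clock time of the run in seconds
    peak         : theoretical peak in GFLOP/s
    """
    if wall_seconds <= 0 or peak <= 0:
        raise ValueError("wall_seconds and peak must be positive")
    achieved_gflops = total_mflop / 1000.0 / wall_seconds
    return achieved_gflops / peak


# Invented example: 8 cores at 2.5 GHz, 16 SP flops/cycle (assumption).
peak = peak_gflops(n_cores=8, clock_ghz=2.5, flops_per_cycle=16)
frac = fraction_of_peak(total_mflop=9.6e6, wall_seconds=100.0, peak=peak)
print(f"peak = {peak:.0f} GFLOP/s, achieved = {100 * frac:.0f}% of peak")
```

As Berk points out, a higher fraction here does not mean a faster simulation, so a number like this is best read next to ns/day rather than instead of it.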
>> gmx-developers mailing list
>> gmx-developers at gromacs.org
>> Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-developers-request at gromacs.org.