[gmx-users] pentium III vs Pentium IVs and Intel compiler benchmarks

Erik Lindahl lindahl at stanford.edu
Tue Mar 26 00:12:37 CET 2002

mark vaughn wrote:

>I was literally filling out a PO for a couple of dual Athlon machines,
>but this P4 talk has made me rethink my plans.  I am new to Gromacs and
>MD for the most part but I have learned a great deal from the discussion
>in this group.
Hi Mark,

>I am interested in the effect of mechanical forces on lipid membranes
>and how those forces affect membrane protein and lipid organization.
>With this in mind, I have been trying to understand how to do MD on
>systems in which surface tension is controlled.  From the discussions in
>this group, I am under the impression that this may be one of those
>situations in which double-precision is necessary, or at least
>desirable. As I understand it, the need for double precision arises
>primarily to keep the fluctuations under control.  Normally, I would have
>run some tests before I posted, but since the P4/double precision topic
>is hot, I thought I should ask now.
>My two questions are:
>1) Is double precision necessary/useful in working with membranes that
>are "stretched" and/or in which surface tension is controlled?

In principle "no", although that's of course only my opinion :-) You are
right that the pressure (and tension) fluctuates a lot during a run,
but it doesn't fluctuate any less in double precision; you just get
*slightly* different values. The fluctuations should be there on short
time scales: they are an effect of how pressure is defined in small systems!

As long as you perform the coupling/scaling over a time scale of a couple
of picoseconds, the average value won't differ significantly.
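
To illustrate the point, here is a minimal Python sketch (not Gromacs
code). The numbers are made up: Gaussian pressure samples around 1 bar
with a spread of 400 bar, and single precision is only mimicked by
rounding each sample to a 32-bit float. All function names are
hypothetical. It does not model how trajectories drift apart between the
two precisions; it only shows that the rounding shift in the average is
tiny compared to the statistical uncertainty caused by the fluctuations.

```python
import random
import statistics
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sample_pressures(n, mean, spread, seed=0):
    """Hypothetical instantaneous pressures in bar: Gaussian noise around a mean."""
    rng = random.Random(seed)
    return [rng.gauss(mean, spread) for _ in range(n)]


def compare_precisions(n=5000, mean=1.0, spread=400.0):
    """Return (average in double, average of single-rounded values, standard error)."""
    double_vals = sample_pressures(n, mean, spread)
    single_vals = [to_single(p) for p in double_vals]
    avg_double = statistics.fmean(double_vals)
    avg_single = statistics.fmean(single_vals)
    # Standard error of the mean: the uncertainty that comes from the
    # fluctuations themselves, independent of the floating-point precision.
    std_err = statistics.stdev(double_vals) / n ** 0.5
    return avg_double, avg_single, std_err


if __name__ == "__main__":
    avg_d, avg_s, err = compare_precisions()
    print(f"double average : {avg_d:.6f} bar")
    print(f"single average : {avg_s:.6f} bar")
    print(f"difference     : {abs(avg_d - avg_s):.2e} bar")
    print(f"standard error : {err:.2f} bar")
```

With these assumed numbers the difference between the two averages is
orders of magnitude below the standard error, so the precision is not
what limits the accuracy of the average.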

>2) If double precision is required, are the P4 capabilities appreciably
>better for Gromacs?  I have seen the P4 trounced by Athlon in single
>precision benchmarks, but I do not believe I have seen double precision
Yes, if you are going to do double precision you should definitely go 
with a P4 since we've written SSE2 assembly loops that use the double 
precision multimedia instructions on the Pentium4. These are not 
available on Athlons.

>Intel C compiler:
>Along with others, I have also been curious about the Intel compiler
>vs gcc, so I ran some simple benchmarks using the cpeptide demo.  For an
>old dual PPro machine the Intel compiler made some difference (see
>below); for a new dual PIII machine there was little difference. The
>results are for a single-precision simulation using cutoffs (I did not
>want to recompile fftw with the icc compiler). Recompiling Gromacs with
>icc was trivial (thanks Gromacs team!)
>For PPro, 192 MB RAM, 233 MHz, 2.4.18 SMP kernel (no assembler loops):
>About 6% increase in performance for Intel C compiler over gcc 3.0.4
>About 14% increase in performance for Intel C compiler over gcc 2.96
>About 8% increase in performance for gcc 3.0.4 over gcc 2.96
>For PIII, 1 GB RAM, 1000 MHz, 2.4.8 SMP kernel (assembler loops):
>About 4% increase in performance for Intel C compiler over gcc 2.96
>Note: performance means ps of simulation per hour of CPU time.

Thanks, it's interesting to have the figures for the discussion :-)
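
As a quick consistency check on the PPro figures: since performance is
measured as ps per CPU hour, relative gains multiply rather than add. The
small Python sketch below (function names are hypothetical, purely for
illustration) combines the 6% and 8% gains and converts a throughput
gain into the corresponding reduction in run time.

```python
def compose_speedups(*percent_gains):
    """Combine successive throughput gains (in percent) into one overall gain."""
    factor = 1.0
    for gain in percent_gains:
        factor *= 1.0 + gain / 100.0
    return (factor - 1.0) * 100.0


def runtime_reduction(percent_gain):
    """Percent less CPU time needed for the same simulation length."""
    return (1.0 - 1.0 / (1.0 + percent_gain / 100.0)) * 100.0


if __name__ == "__main__":
    # icc over gcc 3.0.4 (6%) combined with gcc 3.0.4 over gcc 2.96 (8%)
    combined = compose_speedups(6.0, 8.0)
    print(f"icc over gcc 2.96, combined: {combined:.1f}%")
    print(f"run time saved at 14% gain : {runtime_reduction(14.0):.1f}%")
```

The combined value comes out at about 14.5%, which agrees with the
roughly 14% measured directly for the Intel compiler over gcc 2.96.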

The reason we don't see much speedup with the assembly loops is that 
both compilers translate them line by line into machine code without 
performing any optimization. I don't think there is much to gain anyway, 
since modern CPUs are very efficient at dynamic runtime instruction 
scheduling. I spent a couple of days trying to improve the scheduling,
but it really doesn't make any difference.


