[gmx-users] pentium III vs Pentium IVs and Intel compiler benchmarks
Erik Lindahl
lindahl at stanford.edu
Tue Mar 26 00:12:37 CET 2002
mark vaughn wrote:
>Greetings,
>
>I was literally filling out a PO for a couple of dual Athlon machines,
>but this P4 talk has made me rethink my plans. I am new to Gromacs and
>MD for the most part but I have learned a great deal from the discussion
>in this group.
>
Hi Mark,
>
>
>I am interested in the effect of mechanical forces on lipid membranes
>and how those forces affect membrane protein and lipid organization.
>With this in mind, I have been trying to understand how to do MD on
>systems in which surface tension is controlled. From the discussions in
>this group, I am under the impression that this may be one of those
>situations in which double-precision is necessary, or at least
>desirable. As I understand it, the need for double precision arises
>primarily to keep the fluctuations under control. Normally, I would have
>run some tests before I posted, but since the P4/double precision topic
>is hot, I thought I should ask now.
>
>My two questions are:
>1) Is double precision necessary/useful in working with membranes that
>are "stretched" and/or in which surface tension is controlled?
>
In principle "no", although that's of course only my opinion :-) You are
right that the pressure (and tension) fluctuates a lot during a run,
but it doesn't fluctuate less in double precision - you just get
*slightly* different values. The fluctuations should be there on short
time scales - they are an effect of how pressure is defined in small
systems! As long as you perform coupling/scaling over a time scale of a
couple of picoseconds the average value won't differ significantly.
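
A minimal sketch of that point, under loudly stated assumptions: the pressure
values below are synthetic Gaussian noise (mean 1 bar, spread 300 bar, both
numbers invented for illustration), not output from Gromacs, and the helper
names are hypothetical. It only shows that rounding each value to single
precision shifts the average by far less than the fluctuations themselves. It
does not model how two real trajectories diverge.

```python
import random
import statistics
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def pressure_series(n_samples, mean_bar=1.0, sigma_bar=300.0, seed=1):
    """Synthetic 'instantaneous pressures': a small mean buried in large noise.

    mean_bar and sigma_bar are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    return [rng.gauss(mean_bar, sigma_bar) for _ in range(n_samples)]


double_values = pressure_series(5000)
single_values = [to_single(p) for p in double_values]

mean_double = statistics.fmean(double_values)
mean_single = statistics.fmean(single_values)
spread = statistics.pstdev(double_values)

print(f"fluctuation (std dev): {spread:.1f} bar")
print(f"mean of double values: {mean_double:.6f} bar")
print(f"mean of single values: {mean_single:.6f} bar")
print(f"difference of means:   {abs(mean_double - mean_single):.2e} bar")
```

The spread is the same in both cases; only the last digits of the individual
values change, and those changes average out.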
>
>
>2) If double precision is required, are the P4 capabilities appreciably
>better for Gromacs? I have seen the P4 trounced by Athlon in single
>precision benchmarks, but I do not believe I have seen double precision
>benchmarks.
>
Yes, if you are going to do double precision you should definitely go
with a P4 since we've written SSE2 assembly loops that use the double
precision multimedia instructions on the Pentium4. These are not
available on Athlons.
>
>Intel C compiler:
>Along with others, I have also been curious about the Intel compiler
>vs gcc, so I ran some simple benchmarks using the cpeptide demo. For an
>old dual PPro machine the Intel compiler made some difference (see
>below); for a new dual PIII machine there was little difference. The
>results are for a single-precision simulation using cutoffs (I did not
>want to recompile fftw with the icc compiler). Recompiling Gromacs with
>icc was trivial (thanks Gromacs team!)
>
>For PPro, 192 MB RAM, 233 MHz, 2.4.18SMP kernel (no assembler loops)
>
>About 6% increase in performance for intel c compiler over gcc 3.0.4
>About 14% increase in performance for intel c compiler over gcc 2.96
>About 8% increase in performance for gcc 3.0.4 over gcc 2.96
>
>For PIII, 1 GB RAM, 1000 MHz, 2.4.8SMP kernel (assembler loops)
>
>About 4% increase in performance for intel c compiler over gcc 2.96
>
>Note: performance means ps simulation/hr cpu time.
>
Thanks, it's interesting to have the figures for the discussion :-)
The reason we don't see much speedup with the assembly loops is that
both compilers translate them line by line into machine code without
performing any optimization. I don't think there is much to gain anyway,
since modern CPUs are very efficient at dynamic runtime instruction
scheduling. I spent a couple of days trying to improve the scheduling,
but it really doesn't make any difference.
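
As a quick sanity check on the quoted PPro figures, relative gains multiply
rather than add: 6% (icc over gcc 3.0.4) on top of 8% (gcc 3.0.4 over gcc 2.96)
gives 1.06 x 1.08 = 1.1448, i.e. roughly the 14% reported for icc over gcc 2.96.
A small sketch, where the function names are hypothetical and "performance" is
ps simulated per hour of CPU time as defined above:

```python
def speedup_percent(faster, slower):
    """Percent gain of one performance figure (ps/hr) over another."""
    return (faster / slower - 1.0) * 100.0


def compose_percent(*gains):
    """Combine successive percent gains by multiplying their speedup factors."""
    factor = 1.0
    for gain in gains:
        factor *= 1.0 + gain / 100.0
    return (factor - 1.0) * 100.0


# icc over gcc 3.0.4 (6%), then gcc 3.0.4 over gcc 2.96 (8%)
combined = compose_percent(6.0, 8.0)
print(f"icc over gcc 2.96, implied: {combined:.1f}%")
```

The implied value agrees with the directly measured "about 14%" to within the
rounding of the quoted numbers.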
Cheers,
Erik