[gmx-users] Opteron: compile with 32-bit SSE

Erik Lindahl lindahl at sbc.su.se
Mon Sep 13 14:58:41 CEST 2004


Hi,

I'm pretty close to moving the 64-bit SSE loops into the main 
distribution, but I've got a handful more tricks to try.

First: 64-bit and 32-bit assembly actually differ a bit, since the pointers 
must be 64-bit and the function call sequence is completely different. 
All this is solved (and works), but I've wanted to hold off the final 
testing with the rest of the code until everything is frozen so I don't 
have to repeat it several times...
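
To see which of the two cases a given build falls into, a small C check along the following lines may help. It is only a sketch and not part of GROMACS: the function names are made up for illustration, and it assumes a GCC-compatible compiler, which predefines `__SSE__`, `__SSE2__` and `__x86_64__` according to the target flags (for example `-m32 -msse -msse2`).

```c
#include <stdio.h>

/* Pointer width in bits for this build: 32 with -m32, 64 with -m64. */
static int pointer_bits(void)
{
    return (int)(sizeof(void *) * 8);
}

/* 1 if the compiler was targeting SSE when this file was built, else 0. */
static int built_with_sse(void)
{
#if defined(__SSE__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler was targeting SSE2 when this file was built, else 0. */
static int built_with_sse2(void)
{
#if defined(__SSE2__)
    return 1;
#else
    return 0;
#endif
}

/* 1 if this is a 64-bit x86 (AMD64) build, else 0. */
static int built_for_x86_64(void)
{
#if defined(__x86_64__)
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line summary of the build target to the given stream. */
static void print_build_report(FILE *out)
{
    fprintf(out, "pointers: %d-bit, x86-64: %d, SSE: %d, SSE2: %d\n",
            pointer_bits(), built_for_x86_64(),
            built_with_sse(), built_with_sse2());
}
```

Calling `print_build_report(stdout)` from a `main()` and compiling once with `-m32 -msse -msse2` and once with `-m64` shows the pointer width the assembly loops have to deal with in each mode. It says nothing about whether configure actually selected the assembly loops.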

Second: The Opteron is just as fast at running 32-bit code. 
Theoretically the additional registers in 64-bit mode could improve 
things, but in my testing I've found the automatic register renaming to 
do such a good job that the difference is negligible in practice. In 
other words: the SSE option will work automatically for Opterons in 64-bit 
mode very soon, but don't expect any magical speedup compared to 32-bit 
mode!

Cheers,

Erik

On Sep 10, 2004, at 9:35 AM, David van der Spoel wrote:

> On Fri, 2004-09-10 at 03:27, Amadeu Sum wrote:
>> Dear Choon-Peng,
>>
>> Thank you for your comments. I had already done exactly what David and
>> you suggested, and now I am facing also the same problem as you, that
>> is, we have a specific interconnect for the Opteron cluster and it has
>> to be compiled with specific libraries only available in the Opteron.
>>
>> Anyone else with this dilemma? Or will we have to wait for the
>> adjustments Erik is making for the 64-bit version?
>
> I think it is actually unrelated to the 64-bit extensions. It should be
> possible to use SSE even if you compile in 64-bit. The only problem I
> can think of is the use of 64-bit integers instead of 32-bit.
>
> I'll try to compile on an AMD64 OS, and report back later.
>>
>>                          Amadeu
>>
>>
>> On Fri, 10 Sep 2004 08:44:56 +0800, Chng Choon Peng
>> <cpchng at bii.a-star.edu.sg> wrote:
>>> Dear Amadeu,
>>>
>>>   I'm also facing this problem and followed the suggestion from 
>>> David to
>>> compile GROMACS on a Pentium system and then transfer the binary 
>>> across.
>>> It worked for me and I would like to share my experiences.
>>>
>>> For GCC (I used default 2.96), you have to compile with the 
>>> additional flags
>>> "--without-x" and "--without-xml". When compiling on the Opteron 
>>> (yes, no
>>> SSE/SSE2), I have to include these 2 flags as well.
>>> You can just use the default set of flags.
>>>
>>> For Intel C Compilers:
>>> export CFLAGS="-O3 -static -static-libcxa"
>>> And I believe you have to compile FFTw 2.1.x in static mode as well.
>>>
>>> In general, ICC compiled SSE code performs better than GCC.
>>> And of course with SSE/SSE2, performance is better than without.
>>> I collected some benchmark numbers on my homepage that might be of
>>> reference:
>>> http://web.bii.a-star.edu.sg/~cpchng/Gromacs_benchmarks.html
>>>
>>> I used bold to highlight the best performance for single-precision 
>>> GROMACS
>>> across platforms and purple for double-precision.
>>> The Opteron 2.2GHz we have is the best platform in almost all gmx
>>> benchmarks.
>>>
>>> However, I still wish for the compilation problem with SSE to be
>>> resolved, as we will be using an interconnect that does not exist on
>>> the Pentium cluster, so we will not be able to compile an optimal
>>> parallel version for the Opterons.
>>>
>>> cheers,
>>> Choon-Peng
>>> --
>>> Mr. Choon-Peng CHNG
>>> Research Associate
>>> Computational Biology Group
>>> BioInformatics Institute, BMSI, A*STAR
>>> 30 Biopolis Street
>>> #07-01 Matrix Building
>>> Singapore 138671
>>> Tel (O): +65 64788301 Fax (O): +65 64789047
>>> www.bii.a-star.edu.sg/~cpchng
>>>
>>>
>>>
>>>
>>> On 9/10/04 5:09 AM, "David" <spoel at xray.bmc.uu.se> wrote:
>>>
>>>> On Thu, 2004-09-09 at 22:26, Amadeu Sum wrote:
>>>>> I would like to ask for help from those who are using Gromacs on
>>>>> an Opteron system.
>>>>>
>>>>> I have read many of the postings related to the performance issue,
>>>>> and it seems that Gromacs compiled in 32-bit with SSE works better on
>>>>> the Opteron. Can anyone point out what options/flags they are
>>>>> using to get Gromacs to compile on an Opteron system in 32-bit with
>>>>> SSE? I can get it to compile in 32-bit, but I have been unable to
>>>>> get SSE. I am specifying:
>>>>> CPFLAGS = -O6 -fomit -fomit-frame-pointer -finline-functions -Wall
>>>>> -Wno-unused -malign-double -funroll-all-loops -m32 -msse -msse2
>>>>> -mfpmath=sse
>>>>>
>>>>> I have also tried explicitly specifying "--enable-x86-asm" and
>>>>> "--disable-cpu-optimization" in configure, but that has not helped.
>>>>>
>>>>> I am using gcc v3.2.3
>>>>>
>>>> Do you have a normal pentium somewhere?
>>>> Then your best bet is to compile there with the intel compilers---
>>>>
>>>>> Thanks in advance,
>>>>>
>>>>>                          Amadeu
>>>>> _______________________________________________
>>>>> gmx-users mailing list
>>>>> gmx-users at gromacs.org
>>>>> http://www.gromacs.org/mailman/listinfo/gmx-users
>>>>> Please don't post (un)subscribe requests to the list. Use the
>>>>> www interface or send it to gmx-users-request at gromacs.org.
>>>
>>>
