[gmx-developers] Assembler loops on Itanium2 Montecito.

Carsten Kutzner ckutzne at gwdg.de
Thu Feb 21 21:01:41 CET 2008


On 21.02.2008 at 20:00, Rafael R. Pappalardo wrote:

>> Hi Rafael,
>>
>> First - which Gromacs version are you running?
>>
> I am using VERSION 3.3.99_development_20080208.
> Other CVS versions have the same problem.
> Should I try with 3.3.2?
>
>> We experienced something similar a while ago, but I think that was a
>> compiler bug I never resolved.
>
> I have used GCC 4.1, GCC 3.4 and ICC 10.1 with the same results. I am
> not sure how the assembler is compiled. Could I change the compiler
> used to compile the assembler routines?
>
>>
>> For reference, ia64 isn't particularly fast compared to x86 since we
>> have assembly loops for both :-)
>
> Do you have an idea about the penalty incurred on IA64 by not using
> assembly loops? I mean, is it worthwhile to try to solve the problem?
> I have something like 20 dual-core Itanium2 and my boss will be a bit
> upset if I tell him that a normal PC outperforms them.

Hi Rafael,

A while ago I also ran into this problem on the Montecito processors.
I then compiled the code with the Fortran inner loops, which turned
out to be quite fast on the Itanium architecture. With an 80000 atom
test system (protein + membrane + water) I get a single-processor
performance of 0.29 ns/day with the Fortran inner loops, compared to
0.32 ns/day with the assembly inner loops (but only 0.12 ns/day with
the C inner loops). Those benchmarks were made with Gromacs 3.3.1, but
the single-processor performance is essentially the same as in the CVS
version. Note that on a 'normal x86 PC' the assembly loops are
typically a lot (more than a factor of 1.5) faster than the Fortran ones.
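
To put the three figures in proportion, here is a minimal sketch that
expresses each inner-loop variant as a fraction of a chosen baseline. The
names NS_PER_DAY and relative_to are made up for this illustration and are
not part of Gromacs; the numbers are the ns/day values quoted above.

```python
# Single-processor throughput in ns/day, as quoted in this message
# (80000 atom test system, Gromacs 3.3.1, Itanium2 Montecito).
# The dictionary and function names are hypothetical, for illustration only.
NS_PER_DAY = {"assembly": 0.32, "fortran": 0.29, "c": 0.12}


def relative_to(baseline, results=NS_PER_DAY):
    """Return each variant's throughput divided by the baseline variant's."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


if __name__ == "__main__":
    # Show how each inner-loop variant compares with the assembly loops.
    for name, fraction in relative_to("assembly").items():
        print(f"{name:9s} {fraction:.2f} x assembly")
```

Read this way, the Fortran loops reach roughly 91% of the assembly
throughput on this machine and are about 2.4 times faster than the C loops.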

Carsten


