[gmx-developers] Assembler loops on Itanium2 Montecito.
Carsten Kutzner
ckutzne at gwdg.de
Thu Feb 21 21:01:41 CET 2008
On 21.02.2008 at 20:00, Rafael R. Pappalardo wrote:
>> Hi Rafael,
>>
>> First - which Gromacs version are you running?
>>
> I am using VERSION 3.3.99_development_20080208.
> Other CVS versions have the same problem.
> Should I try with 3.3.2?
>
>> We experienced something similar a while ago, but I think that was a
>> compiler bug I never resolved.
>
> I have used GCC 4.1, GCC 3.4 and ICC 10.1 with the same results. I am
> not sure how the assembler is compiled. Could I change the compiler
> used to compile the assembler routines?
>
>>
>> For reference, ia64 isn't particularly fast compared to x86 since we
>> have assembly loops for both :-)
>
> Do you have an idea about the penalty incurred on IA64 by not using
> assembly loops? I mean, is it worthwhile to try to solve the problem?
> I have something like 20 dual-core Itanium2 and my boss will be a bit
> upset if I tell him that a normal PC outperforms them.
Hi Rafael,
a while ago I also ran into this problem on the Montecito processors.
I then compiled the code with the Fortran inner loops, which turned
out to be quite fast on the Itanium architecture. With an 80000-atom
test system (protein + membrane + water) I get a single-processor
performance of 0.29 ns/day with the Fortran inner loops, compared to
0.32 ns/day with assembly inner loops (but only 0.12 ns/day with C
inner loops). Those benchmarks were made with Gromacs 3.3.1, but the
single-processor performance is essentially the same as in the CVS
version. Note that on a 'normal x86 PC' the assembly loops are
typically a lot faster (by more than a factor of 1.5) than the Fortran
ones.
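
To put the three Itanium2 numbers side by side, here is a small sketch that computes the relative speeds from the ns/day figures quoted above. The dictionary and function names are only illustrative and are not part of Gromacs; the input values are the three measurements from this mail.

```python
# Single-processor throughput in ns/day, as quoted in this mail
# (80000-atom system, Gromacs 3.3.1, Itanium2 Montecito).
NS_PER_DAY = {"assembly": 0.32, "fortran": 0.29, "c": 0.12}


def speedup(kernel, baseline):
    """Return how many times faster `kernel` runs than `baseline`."""
    return NS_PER_DAY[kernel] / NS_PER_DAY[baseline]


def report():
    """Build one summary line per inner-loop flavour."""
    lines = []
    for kernel in ("assembly", "fortran", "c"):
        lines.append(
            "%-8s %.2f ns/day  %.2fx vs C  %3.0f%% of assembly"
            % (
                kernel,
                NS_PER_DAY[kernel],
                speedup(kernel, "c"),
                100.0 * speedup(kernel, "assembly"),
            )
        )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

On these figures the Fortran loops reach roughly 91% of the assembly speed and are about 2.4 times faster than the C loops, so falling back to Fortran costs little on this machine.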
Carsten