[gmx-developers] MASM assembly syntax

chris.neale at utoronto.ca chris.neale at utoronto.ca
Mon Jul 30 10:38:14 CEST 2007


This is not a particular request, but may be an interesting starting  
point for those wanting to tweak the speed of gromacs. This thread  
seemed like a reasonable place to post this information.

Regarding compilers, what about Sun Studio 12 (for Linux boxes)? They
tell our sys admin that they can achieve 25% faster runs than icc or
gcc. However, their assembler doesn't handle the gromacs assembly
code. I assembled the assembly portion using gcc and then compiled the
rest using Sun Studio by interrupting the initial make (see method
below). It runs fine, but doesn't go any faster than gcc alone.

Below I list compilation instructions and also at the end some  
benchmarks that I ran in case this interests anybody.

Here are the instructions that I followed:
"To do this, simply configure as normal and press ctrl-C once make
starts compiling files for src/gmxlib/nonbonded/nb_kernel/ (much less than a
minute). Then remove *.o and *.lo under that directory and change
mov to movq on line #29140 of "configure". You will then be
able to configure gromacs successfully.

$ diff ../configure.new configure
29140c29140
<         movq    %rsp, %rbp
---
>         mov     %rsp, %rbp
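
The same one-line change could be scripted rather than made by hand. This is only a sketch: the function name patch_mov_to_movq is hypothetical, and line 29140 applies only to this particular generated configure script, so check the line before patching. It keeps a backup copy with a .orig suffix and leaves an existing movq untouched.

```shell
# Hypothetical helper (not part of gromacs): on one given line of a file,
# rewrite "mov %rsp, %rbp" as "movq %rsp, %rbp".
# A backup is kept as <file>.orig; writing back into the existing file
# preserves its permissions (configure must stay executable).
patch_mov_to_movq() {
    file=$1
    line=$2
    cp -p "$file" "$file.orig" &&
        sed "${line}s/mov\([[:space:]]\{1,\}\)%rsp, %rbp/movq\1%rsp, %rbp/" \
            "$file.orig" > "$file"
}

# Example call, assuming the line number reported in the diff above:
#   patch_mov_to_movq configure 29140
```

Running diff against configure.orig afterwards should show only the single changed line.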

For the entirely-non-assembly compilation test, I passed  
--disable-x86-64-sse to configure

Here are the generic instructions that I followed to compile scalar
gromacs with Sun Studio 12.

export PATH=/tools/SunStudio/12/sunstudio12/bin:$PATH
./configure --disable-float CC=cc CXX=CC F77=f77 \
CFLAGS="-xtarget=opteron -xarch=sse2 -xprefetch -xprefetch_level=2 \
-xvector=simd -xdepend=yes -xbuiltin=%all -xO5 -D_Bool=bool" \
CXXFLAGS="-xtarget=opteron -xarch=sse2 -xprefetch -xprefetch_level=2 \
-xvector=simd -xdepend=yes -xbuiltin=%all -xO5 -D_Bool=bool" \
FFLAGS="-xtarget=opteron -xarch=sse2 -xprefetch -xprefetch_level=2 \
-xvector=simd -stackvar -xO5 -stackvar -mt -dalign -fpp" \
CPPFLAGS=-I/tools/gromacs/fftw-3.0.1/include \
LDFLAGS=-L/tools/gromacs/fftw-3.0.1/lib \
--prefix=/tools/gromacs/3.3.1ss --disable-x86-64-sse
make
make install

#######

The benchmarks are below.  I don't know why my compilation didn't  
improve with the gcc assembly option. I also don't know why there is  
such a similar speed for single and double precision without the  
assembly loops.

Test system:
Protein in water
38,060 atoms
Ewald, and other regular simulation conditions used.

Single Precision:
gcc (0.869, 0.914, 0.973ns/day)
gcc+noAssembly (0.574ns/day)

sun+gccAssembly (0.509, 0.479ns/day)
sun+noAssembly (0.526, 0.518ns/day)


Double Precision:
gcc (0.591, 0.601ns/day)
gcc+noAssembly  -- not tested

sun+gccAssembly (0.622ns/day)
sun+noAssembly (0.491ns/day)
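
Where a configuration above has more than one run, the runs can be averaged before comparing compilers. A minimal sketch, assuming awk is available; the helper name mean is hypothetical and the inputs are the ns/day figures listed above.

```shell
# Hypothetical helper: print the arithmetic mean of its arguments
# (ns/day values) to three decimal places.
mean() {
    printf '%s\n' "$@" |
        awk '{ sum += $1; n++ } END { if (n) printf "%.3f\n", sum / n }'
}

# Single precision runs from the list above.
echo "gcc:             $(mean 0.869 0.914 0.973) ns/day"
echo "sun+gccAssembly: $(mean 0.509 0.479) ns/day"
echo "sun+noAssembly:  $(mean 0.526 0.518) ns/day"
```

With only two or three runs per configuration, these averages give a rough comparison at best.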

Quoting Mark Abraham <Mark.Abraham at anu.edu.au>:

> Utkal Ranjan Pradhan wrote:
>> Hi Friends
>>
>> Any pointer to convert the Gromacs 3.3.1 assembly loop codes (AT&T   
>> and Intel Syntax) to MASM syntax ?
>>
>> So that we can use MS MASM assembler (ml.exe/ml64.exe) to build   
>> Gromacs for native windows.
>
> Probably nobody cares enough about Windows as a high-end compute
> platform to want to do this. You might try Googling for the intel2gas
> tool. Gromacs installs readily with cygwin, see
> http://wiki.gromacs.org/index.php/GROMACS_on_Windows
>
> Mark