[gmx-developers] MASM assembly syntax
chris.neale at utoronto.ca
Mon Jul 30 10:38:14 CEST 2007
This is not a particular request, but may be an interesting starting
point for those wanting to tweak the speed of gromacs. This thread
seemed like a reasonable place to post this information.
Regarding compilers, what about Sun Studio 12 (for linux boxes)? They
tell our sys admin that they can achieve 25% faster runs than icc or
gcc. However, their assembler doesn't handle the gromacs assembly
code. I assembled the assembly portion using gcc and then compiled the
rest using Sun Studio by interrupting the initial make (see method
below). It runs fine, but doesn't go any faster than gcc alone.
Below I list compilation instructions and also at the end some
benchmarks that I ran in case this interests anybody.
Here are the instructions that I followed:
"To do this, simply configure as normal and press ctrl-C once make
starts compiling the files under src/gmxlib/nonbonded/nb_kernel/ (this
takes much less than a minute). Then remove the *.o and *.lo files under
that directory and change mov to movq on line #29140 of "configure"; you
will then be able to configure gromacs successfully."
$ diff ../configure.new configure
29140c29140
< movq %rsp, %rbp
---
> mov %rsp, %rbp
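The same one-line change can be scripted instead of edited by hand. A minimal sketch, assuming a POSIX shell and sed; the helper name patch_configure is made up for illustration, and line 29140 applies only to the configure script discussed here (other gromacs versions will have the instruction elsewhere):

```shell
# Hypothetical helper: rewrite "mov %rsp, %rbp" to "movq %rsp, %rbp"
# on one given line of a file.
# Usage: patch_configure FILE LINE
patch_configure() {
    file=$1
    line=$2
    # Keep a backup so the change can be diffed or reverted.
    cp "$file" "$file.orig" || return 1
    # Restrict the substitution to the given line number only.
    # Writing back into the existing file keeps its permissions.
    sed "${line}s/mov %rsp, %rbp/movq %rsp, %rbp/" "$file.orig" > "$file"
}
```

For example, `patch_configure configure 29140` followed by `diff configure configure.orig` should report only that one line. Running it twice is harmless, because the pattern no longer matches once the line reads movq.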
For the entirely-non-assembly compilation test, I passed
--disable-x86-64-sse to configure.
Here are the generic instructions that I followed to compile scalar
gromacs with Sun Studio 12.
export PATH=/tools/SunStudio/12/sunstudio12/bin:$PATH
./configure --disable-float CC=cc CXX=CC F77=f77 \
CFLAGS="-xtarget=opteron -xarch=sse2 -xprefetch -xprefetch_level=2 \
-xvector=simd -xdepend=yes -xbuiltin=%all -xO5 -D_Bool=bool" \
CXXFLAGS="-xtarget=opteron -xarch=sse2 -xprefetch -xprefetch_level=2 \
-xvector=simd -xdepend=yes -xbuiltin=%all -xO5 -D_Bool=bool" \
FFLAGS="-xtarget=opteron -xarch=sse2 -xprefetch -xprefetch_level=2 \
-xvector=simd -stackvar -xO5 -stackvar -mt -dalign -fpp" \
CPPFLAGS=-I/tools/gromacs/fftw-3.0.1/include \
LDFLAGS=-L/tools/gromacs/fftw-3.0.1/lib \
--prefix=/tools/gromacs/3.3.1ss --disable-x86-64-sse
make
make install
#######
The benchmarks are below. I don't know why my compilation didn't
improve with the gcc assembly option. I also don't know why there is
such a similar speed for single and double precision without the
assembly loops.
Test system:
Protein in water
38,060 atoms
Ewald, and other regular simulation conditions used.
Single Precision:
gcc (0.869, 0.914, 0.973 ns/day)
gcc+noAssembly (0.574 ns/day)
sun+gccAssembly (0.509, 0.479 ns/day)
sun+noAssembly (0.526, 0.518 ns/day)
Double Precision:
gcc (0.591, 0.601 ns/day)
gcc+noAssembly -- not tested
sun+gccAssembly (0.622 ns/day)
sun+noAssembly (0.491 ns/day)
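To compare the builds at a glance, the repeated runs above can be averaged and expressed relative to plain gcc. A rough sketch in shell with awk; the function names mean and ratio are illustrative only, and with just one to three runs per build the averages are indicative at best:

```shell
# Mean of the numbers given as arguments, printed with 3 decimals.
mean() {
    printf '%s\n' "$@" | LC_ALL=C awk '{ s += $1 } END { printf "%.3f\n", s / NR }'
}

# Ratio a/b printed with 2 decimals (values below 1 mean a is slower than b).
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Single precision, ns/day, values copied from the list above.
gcc_sp=$(mean 0.869 0.914 0.973)
sun_asm_sp=$(mean 0.509 0.479)
sun_noasm_sp=$(mean 0.526 0.518)

echo "gcc mean:             $gcc_sp ns/day"
echo "sun+gccAssembly/gcc:  $(ratio "$sun_asm_sp" "$gcc_sp")"
echo "sun+noAssembly/gcc:   $(ratio "$sun_noasm_sp" "$gcc_sp")"
```

On these numbers the single-precision Sun builds come out at a bit over half the gcc throughput, with or without the gcc-assembled kernels. The same two functions can be reused for the double-precision rows.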
Quoting Mark Abraham <Mark.Abraham at anu.edu.au>:
> Utkal Ranjan Pradhan wrote:
>> Hi Friends
>>
>> Any pointer to convert the Gromacs 3.3.1 assembly loop codes (AT&T
>> and Intel Syntax) to MASM syntax ?
>>
>> So that we can use MS MASM assembler (ml.exe/ml64.exe) to build
>> Gromacs for native windows.
>
> Probably nobody cares enough about Windows as a high-end compute
> platform to want to do this. You might try Googling for the intel2gas
> tool. Gromacs installs readily with cygwin, see
> http://wiki.gromacs.org/index.php/GROMACS_on_Windows
>
> Mark