[gmx-users] Parallel FFTW performance

Bing Kim abinitiomd at hotmail.com
Mon Oct 27 01:01:01 CET 2003


Good to hear that somebody has had the same experience as me.
So you are saying you would use the normal (serial) FFT instead of the MPI 
version, even in a multiprocessor environment.
Then you would need to have the whole charge grid on every CPU.
So you would lose some of the time you saved by parallelizing the charge 
assignment and interpolation procedures in the domain decomposition.
Weighing the time lost against the time saved, I don't know which is the 
better solution.
I just don't understand why the FFTW developers did it this way.
Couldn't you find any other parallel FFT library?
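
As a rough illustration of the trade-off above (compute time saved versus
communication time added), here is a minimal sketch of a cost model. It is
not how FFTW or Gromacs actually behaves: it assumes a slab decomposition
with one all-to-all transpose, uses the conventional 5*N*log2(N) operation
count for a complex FFT, and every number in it (grid size, flop rate,
bandwidth, latency) is a hypothetical input, not a measurement.

```python
import math


def fft_flops(n_points):
    """Conventional estimate for a complex FFT of n_points: 5 * N * log2(N)."""
    return 5.0 * n_points * math.log2(n_points)


def parallel_fft_time(n_points, nprocs, flops_per_sec, bytes_per_sec,
                      latency_sec, bytes_per_point=8):
    """Estimated wall time for one 3D FFT split over nprocs processes.

    Assumes a slab decomposition with a single all-to-all transpose, in
    which every process exchanges n_points / nprocs**2 grid points with
    each of the other processes. All parameters are hypothetical inputs.
    """
    compute = fft_flops(n_points) / (flops_per_sec * nprocs)
    if nprocs == 1:
        return compute  # no communication on a single process
    message_bytes = n_points * bytes_per_point / nprocs**2
    communicate = (nprocs - 1) * (latency_sec + message_bytes / bytes_per_sec)
    return compute + communicate


if __name__ == "__main__":
    grid = 64 ** 3  # hypothetical PME grid size
    for nprocs in (1, 2, 4, 8, 16):
        t = parallel_fft_time(grid, nprocs, flops_per_sec=1e9,
                              bytes_per_sec=1e7, latency_sec=1e-4)
        print(f"{nprocs:2d} processes: {t * 1e3:.2f} ms")
```

With a slow interconnect in this model the transpose term can outweigh the
compute time saved, so two processes come out slower than one. With a much
faster interconnect the parallel version wins. Where the crossover lies on
real hardware would have to be measured.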
Thanks,
Bing Kim

>From: Erik Lindahl <lindahl at csb.stanford.edu>
>Reply-To: gmx-users at gromacs.org
>To: gmx-users at gromacs.org
>Subject: Re: [gmx-users] Parallel FFTW performance
>Date: Sun, 26 Oct 2003 15:22:12 -0800
>
>Hi,
>
>The problem is that the Fourier transform is very fast, so when it is 
>parallelized over a lot of nodes with MPI the communication will kill you.
>
>We're working on this - the solution will simply be to skip FFTW 
>parallelization completely. We will use both threads and MPI in Gromacs, 
>and do all synchronization ourselves. The additional obstacle is that you 
>don't want to use all nodes in your system for the PME part. This is 
>trivial to work around with 3-4 nodes, but with 16+ nodes and domain 
>decomposition it's a pretty complicated problem.
>
>Cheers,
>
>Erik
>
>On Oct 26, 2003, at 2:18 PM, Dean Johnson wrote:
>
>>On Sun, 2003-10-26 at 15:45, Bing Kim wrote:
>>>Hi All!
>>>
>>>I am sorry, this question is not exactly about Gromacs but about FFTW.
>>>But the same question would arise with Gromacs too.
>>>I recently installed FFTW-2.1.5, which can use MPI.
>>>It was compiled with gcc-3.3.2 and mpich-1.2.5.2.
>>>When I ran the benchmark test program, rfftw_mpi_test, which is located in
>>>fftw-2.1.5/mpi,
>>>I found that its performance is worse on dual CPUs than on a single CPU.
>>>I expected the communication cost to be basically zero on an SMP machine.
>>>So I wonder, if Gromacs uses rfftw_mpi, how it can get a speedup on
>>>multiple processors.
>>>Please help me understand this.
>>>
>>
>>That is also our experience, not just with Gromacs/FFTW, but also with
>>Amber7. We solve that by running two 16x1 models concurrently. The cost
>>of 8x2 is only a little more than 16x1.
>>
>>--
>>
>>	-Dean
>>
>>_______________________________________________
>>gmx-users mailing list
>>gmx-users at gromacs.org
>>http://www.gromacs.org/mailman/listinfo/gmx-users
>>Please don't post (un)subscribe requests to the list. Use the
>>www interface or send it to gmx-users-request at gromacs.org.
