[gmx-users] Parallel FFTW performance

Erik Lindahl lindahl at csb.stanford.edu
Mon Oct 27 02:34:00 CET 2003


No, you'll have to change the grid spacing with the "fourierspacing" 
option and the interpolation order with "pme_order". It's described in 
more detail in the manual.
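For reference, the two settings live in the .mdp file. A minimal illustrative fragment (the values shown are just the common defaults, not a tuning recommendation):

```
; Illustrative .mdp fragment
fourierspacing  = 0.12    ; maximum PME grid spacing in nm; smaller = denser grid
pme_order       = 4       ; interpolation order (4 = cubic)
```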

On Oct 26, 2003, at 4:00 PM, Bing Kim wrote:

> Good to hear that somebody experienced the same as me.
> You're saying you would use the normal FFT instead of the MPI version 
> even in a multiple-processor environment.
> Then you might need to have the whole charge grid on every CPU.

No, it can be done much smarter. Essentially, each of the PME CPUs does 
its local 2D FFTs, but the communication is interleaved with the 
computation.
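The slab scheme described above can be illustrated in a single process (this is my own sketch of the general slab-decomposition idea, not GROMACS's actual code; the array concatenations stand in for the MPI all-to-all):

```python
import numpy as np

def slab_fft3d(grid, nranks):
    """Single-process sketch of a slab-decomposed 3D FFT."""
    nz, ny, nx = grid.shape
    assert nz % nranks == 0 and ny % nranks == 0
    chunk = nz // nranks
    # Step 1: each "rank" 2D-FFTs its own slab of xy-planes, no communication.
    slabs = [np.fft.fft2(grid[r * chunk:(r + 1) * chunk], axes=(1, 2))
             for r in range(nranks)]
    # Step 2: the transpose. Gathering and repartitioning along y so that the
    # z axis becomes local stands in for the MPI all-to-all exchange.
    full = np.concatenate(slabs, axis=0)
    ychunk = ny // nranks
    cols = [full[:, r * ychunk:(r + 1) * ychunk, :] for r in range(nranks)]
    # Step 3: each rank finishes with purely local 1D FFTs along z.
    done = [np.fft.fft(c, axis=0) for c in cols]
    return np.concatenate(done, axis=1)

# Sanity check against a direct 3D FFT.
rng = np.random.default_rng(0)
g = rng.standard_normal((8, 8, 8))
assert np.allclose(slab_fft3d(g, 4), np.fft.fftn(g))
```

The point of interleaving is that steps 1 and 3 are communication-free, so a real implementation can overlap the step-2 exchange for one batch of slabs with the FFT work on another.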

> I just don't understand why the FFTW guys do it like this.
> Couldn't you find any other parallel FFT library?

Well, there isn't really any simple way to do it. FFT is an incredibly 
fast operation, so the calculation-to-communication ratio is already 
pretty poor. The algorithm also involves a transpose, which means 
all-to-all communication.
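A rough back-of-envelope illustration of why the ratio is poor (my own numbers, using the standard ~5 N log2 N flop estimate for a complex FFT and 16 bytes per complex double moved in the transpose):

```python
import math

# Illustrative model, not measured GROMACS figures:
# a complex N-point FFT costs roughly 5 * N * log2(N) flops, while the
# parallel transpose pushes all N complex doubles (16 bytes each)
# through the all-to-all exchange.
N = 64 ** 3                       # a 64x64x64 PME-sized grid
flops = 5 * N * math.log2(N)      # ~5 N log2 N
bytes_moved = 16 * N              # one complex double per grid point
ratio = flops / bytes_moved
print(f"{ratio:.2f} flops per byte communicated")  # only a few flops/byte
```

Only a handful of flops per byte communicated means the transpose, not the arithmetic, dominates as soon as the network is even moderately slow.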

FFTW is probably among the best parallel implementations, but in this 
case we simply have to rework the algorithm so we don't need to 
parallelize it that much.



More information about the gromacs.org_gmx-users mailing list