[Pw_forum] parallel FFTs - important MPI considerations
Konstantin Kudin
konstantin_kudin at yahoo.com
Mon Feb 6 22:06:43 CET 2006
--- Paolo Giannozzi <giannozz at nest.sns.it> wrote:
>
> "mpi_alltoallv" is used instead of "mpi_alltoall" (without 'v')
> because
> the slices into which the r-space grid is cut along direction 3 may
> not be all of the same size. Of course the special case in which all
> the slices are equal could be quite easily treated with mpi_alltoall,
> if it is really useful; or one could do copies and use mpi_alltoall.
> I would expect the library to take care of this, though.
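For what it's worth, here is a minimal sketch of the "copies +
mpi_alltoall" idea mentioned above: pad every slice to the length of
the largest one, so all counts are equal and the fixed-pattern
collective can be used. This is not QE's actual code; nr3, slice_len
and so on are made-up names for illustration.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    /* Toy grid: nr3 planes cut along direction 3; when
       nr3 % nproc != 0 the slices have unequal lengths. */
    int nr3 = 10;
    int *slice_len = malloc(nproc * sizeof(int));
    for (int p = 0; p < nproc; p++)
        slice_len[p] = nr3 / nproc + (p < nr3 % nproc ? 1 : 0);

    int maxlen = slice_len[0];
    for (int p = 1; p < nproc; p++)
        if (slice_len[p] > maxlen) maxlen = slice_len[p];

    /* One maxlen-sized slot per destination rank; the tail of
       each slot beyond slice_len[dest] is dead padding. */
    double *sendbuf = calloc((size_t)nproc * maxlen, sizeof(double));
    double *recvbuf = calloc((size_t)nproc * maxlen, sizeof(double));
    for (int p = 0; p < nproc; p++)
        for (int i = 0; i < slice_len[p]; i++)
            sendbuf[p * maxlen + i] = 100.0 * rank + p;

    /* Equal counts on every rank: the fully known communication
       pattern, instead of the run-time counts of mpi_alltoallv. */
    MPI_Alltoall(sendbuf, maxlen, MPI_DOUBLE,
                 recvbuf, maxlen, MPI_DOUBLE, MPI_COMM_WORLD);

    /* On rank r, only the first slice_len[r] entries of each
       received slot are meaningful; the rest is padding. */
    if (rank == 0)
        printf("rank 0 received %g from rank %d\n",
               recvbuf[(nproc - 1) * maxlen], nproc - 1);

    free(sendbuf); free(recvbuf); free(slice_len);
    MPI_Finalize();
    return 0;
}

The price is some dead padding on the wire, which should be small
next to what a well-optimized fixed-pattern exchange buys back.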
Well, here is an interesting paper that compares "alltoall" with
"alltoallv", and it seems like optimizing the former is way easier
because the communication pattern is fully known:
http://www.spscicomp.org/ScicomP11/Presentations/User/booth.pdf
I've been playing here with the skampi benchmark
http://liinwww.ira.uka.de/~skampi/skampi4.1.tar.gz, and for both MPICH
and Open-MPI the results suffer the least when "alltoall" is employed.
This Open-MPI is actually very nice for SMP+Gigabit, and behaves a lot
more consistently than MPICH1. In their latest CVS, "alltoall" already
works reasonably well, while "isend-irecv" and "alltoallv" perform
really badly on 8 CPUs. While they promised to look at the issue, I
think sticking with "alltoall" is the optimal long-term strategy if QE
is going to rely mostly on free MPI implementations. MPICH1 also works
best with "alltoall".
Here are the numbers for this:
http://www.princeton.edu/~kkudin/skampi_mpich.txt
http://www.princeton.edu/~kkudin/skampi_openmpi.txt
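If anyone wants a quick check without the full skampi setup, here is a
rough self-contained timing loop along the same lines; the message
size and repetition count are arbitrary choices, not skampi's
settings:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nproc;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);

    const int count = 4096;   /* doubles per rank pair (arbitrary) */
    const int reps  = 100;
    double *sbuf = malloc((size_t)nproc * count * sizeof(double));
    double *rbuf = malloc((size_t)nproc * count * sizeof(double));
    for (int i = 0; i < nproc * count; i++) sbuf[i] = (double)i;

    /* Equal counts and regular displacements, so alltoallv moves
       exactly the same data as alltoall and the comparison is fair. */
    int *counts = malloc(nproc * sizeof(int));
    int *displs = malloc(nproc * sizeof(int));
    for (int p = 0; p < nproc; p++) {
        counts[p] = count;
        displs[p] = p * count;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int r = 0; r < reps; r++)
        MPI_Alltoall(sbuf, count, MPI_DOUBLE,
                     rbuf, count, MPI_DOUBLE, MPI_COMM_WORLD);
    double t_a2a = MPI_Wtime() - t0;

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int r = 0; r < reps; r++)
        MPI_Alltoallv(sbuf, counts, displs, MPI_DOUBLE,
                      rbuf, counts, displs, MPI_DOUBLE, MPI_COMM_WORLD);
    double t_a2av = MPI_Wtime() - t0;

    if (rank == 0)
        printf("alltoall: %.3f s   alltoallv: %.3f s (%d reps, %d procs)\n",
               t_a2a, t_a2av, reps, nproc);

    free(sbuf); free(rbuf); free(counts); free(displs);
    MPI_Finalize();
    return 0;
}

Compile with mpicc and run with e.g. mpirun -np 8 to match the 8-CPU
case above.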
Thus going to "alltoall" for everything FFT-related could be very
helpful for scaling with Gigabit ...
Kostya