[Pw_forum] FFT & MPI - issues with latency

Konstantin Kudin konstantin_kudin at yahoo.com
Sun Jan 29 18:41:08 CET 2006


 Hi all,

 I am wondering whether anybody has systematically investigated the
performance of different MPI implementations with the QE package. Has
anyone been successful in using Open MPI, the project spawned off by
the LAM-MPI people?

 The issue seems to be that FFTs are very latency driven. Some older
tests at
http://www.hpc.sfu.ca/bugaboo/fft-performance.html seem to
indicate that MPICH and LAM-MPI differ significantly in their
performance when running the FFTW library.

 For the largest 2D transform (1600x1600), LAM-MPI appears to be
much faster than MPICH on 8 CPUs. These are pretty old versions, but
it is still quantitative data that is interesting to look at.
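
 To make the latency question a bit more concrete, here is a rough
back-of-the-envelope sketch in Python. It is not taken from FFTW or QE.
It assumes that the transpose step of a distributed n x n complex FFT is
a plain all-to-all in which each of p processes sends one
(n/p) x (n/p) block to every other process, and that each message costs
latency + bytes/bandwidth. The function names are made up for
illustration, and any latency or bandwidth values passed in (for example
50 microseconds and 100 MB/s for Gigabit Ethernet) are guesses that
should be replaced with measured numbers.

```python
def alltoall_time(n, p, latency_s, bandwidth_Bps, bytes_per_elem=16):
    """Estimated time (s) of one all-to-all transpose of an n x n array
    over p processes, using a simple latency + size/bandwidth model.

    Assumes p divides n and that messages are sent one after another;
    bytes_per_elem=16 corresponds to double-precision complex numbers.
    """
    if p <= 1:
        return 0.0  # nothing to exchange on a single process
    block_bytes = (n // p) * (n // p) * bytes_per_elem
    return (p - 1) * (latency_s + block_bytes / bandwidth_Bps)


def latency_fraction(n, p, latency_s, bandwidth_Bps, bytes_per_elem=16):
    """Fraction of the modelled transpose time that is pure latency."""
    total = alltoall_time(n, p, latency_s, bandwidth_Bps, bytes_per_elem)
    if total == 0.0:
        return 0.0
    return (p - 1) * latency_s / total


if __name__ == "__main__":
    # Assumed, not measured: 50 us latency, 100 MB/s usable bandwidth.
    for p in (2, 4, 8):
        t = alltoall_time(1600, p, 50e-6, 100e6)
        f = latency_fraction(1600, p, 50e-6, 100e6)
        print(f"p={p}: transpose ~{t:.4f} s, latency share {f:.4f}")
```

 In this model the blocks shrink as p grows, so the latency share of
each transpose goes up with the number of processes. Whether latency or
bandwidth actually dominates for a given grid depends entirely on the
real numbers for the interconnect and the MPI library.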

 Our experience here with QE on dual Opterons is that there is
no gain in CPU time at all when going beyond 2 nodes with 2 CPUs each
over Gigabit Ethernet. CPMD on this Opteron cluster behaved pretty much
identically to CP, which indicates that the problem transcends slight
differences in implementation. Nicola's benchmarks with LAM-MPI [
http://nnn.mit.edu/ESPRESSO/CP90_tests/CP90.timings.large ] seemed to
indicate that there was some improvement beyond 4 CPUs.

 So now I am starting to think that MPICH may be the reason we see no
improvement at all beyond 4 CPUs.

 Ideas?

 Kostya
