[Pw_forum] QE with ESSL on BlueGene
Axel Kohlmeyer
akohlmey at cmm.chem.upenn.edu
Wed Nov 14 06:44:08 CET 2007
On Wed, 14 Nov 2007, Dr Brent Walker wrote:
BW> Hi all,
hi brent,
BW> Does anyone know whether QE has been used successfully with the IBM
BW> ESSL libraries on the Blue Gene L architecture (running linux)? Google
BW> unfortunately hasn't provided me with much in relation to this.
yep. i've managed to compile and run QE on a BG/L.
BW> I have spent some time trying to get QE (well really PWscf) running on
BW> such a machine and am at the stage of deciding whether to persevere or
BW> just give up and use locally compiled versions of fftw and lapack/blas
BW> (following say the provided "Make.bgl" file, which seems to work fine
BW> for me).
due to the (lack of) features in the BG/L cpu, you may actually
get reasonable performance with regular BLAS/LAPACK. you can
try the "double hummer" libraries, but then you are limited to
coprocessor mode. you will probably need that mode anyway, because
the constraints on jobs on BG/L are very tight. there is no local
storage, so pw.x with its default setting of storing wavefunctions
as files does not scale well. you'll have to use the (experimental)
feature of storing those in memory, but with 512MB/node there is
not much memory available. on top of that, the cpus on BG/L are
very slow, so you need to parallelize across a large number of
cpus to get decent performance. in my view for a code like pw.x
it is currently not worth the hassle. your chances with cp.x
are much better, but then again, you are limited by the supported
feature set of cp.x. altogether, you have to keep in mind that
BG/L is mainly a machine to get a good ranking in the top500
and thus please administrators, politicians and generally people
who are not using it. from the user's perspective it is a constant
struggle and a PITA. if i had the choice, i'd rather skip the
top500 placement and get a machine that is usable. the majority
of QE jobs are run on rather small clusters, so running well on
those machines is where most of the effort goes.
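
to put a rough number on the memory point above, here is a minimal
sketch in python. it assumes the wavefunctions are complex double
precision coefficients (16 bytes each) and that the plane waves are
spread evenly over the nodes. the function names and the example
system size are made up for illustration and are not taken from QE;
the real footprint of pw.x will be larger because of work arrays,
densities, potentials and so on.

```python
# rough estimate only; all names and the example sizes are hypothetical
BYTES_PER_COMPLEX_DOUBLE = 16          # two 8-byte floats per coefficient
NODE_MEMORY_BYTES = 512 * 1024**2      # 512MB per BG/L node, as quoted above


def wavefunction_bytes_per_node(n_kpoints, n_bands, n_planewaves, n_nodes):
    """Per-node bytes needed to keep all wavefunctions in memory.

    Assumes an even distribution of plane waves across the nodes and
    ignores every other array the code has to hold.
    """
    total = n_kpoints * n_bands * n_planewaves * BYTES_PER_COMPLEX_DOUBLE
    return total / n_nodes


def fraction_of_node_memory(n_kpoints, n_bands, n_planewaves, n_nodes):
    """Share of one node's 512MB taken by the wavefunctions alone."""
    per_node = wavefunction_bytes_per_node(
        n_kpoints, n_bands, n_planewaves, n_nodes
    )
    return per_node / NODE_MEMORY_BYTES


if __name__ == "__main__":
    # made-up system: 8 k-points, 200 bands, 100000 plane waves, 64 nodes
    frac = fraction_of_node_memory(8, 200, 100000, 64)
    print(f"wavefunctions use {frac:.1%} of each node's memory")
```

plugging in your own k-points, bands and plane wave count gives a
quick lower bound on whether the in-memory option can fit at all on
a given partition size.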
BW> Is this worth pursuing or should I just file it in the "too hard"
BW> basket for the time being? If people think there is some hope that I
BW> can get this to work, I'll provide more details (make.sys, etc.).
BW>
BW> Thanks very much for any information/thoughts/anecdotes on this!
well, i've been struggling a lot with finding _any_ project that
runs well on a BG/L that does not run better on a cray xt3/xt4
or even a reasonably well laid out PC cluster with DDR infiniband.
my best results so far were with classical MD using LAMMPS on
systems that have no coulomb interactions. there i am scaling
out on the BG/L at half the performance of the scaleout timing
on a cray xt3. for most codes, particularly plane wave
pseudopotential DFT, the difference is about a factor of 10.
so before putting in more effort, it might be worth discussing
what kind of calculations you intend to run and how much
cpu time across how many nodes you have at your disposal.
cheers,
axel.
BW>
BW> Brent.
BW>
BW> PS. I have noted AK's comment "good luck (you'll be needing it)"
BW> regarding compilation of QE on BG/L on 31 Aug, which of course doesn't
BW> bode well!
BW>
BW>
--
=======================================================================
Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu http://www.cmm.upenn.edu
Center for Molecular Modeling -- University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.