[Pw_forum] Re: Comparison of 3.1.1 and 3.2 (cvs)
Giovanni Cantele
Giovanni.Cantele at na.infn.it
Fri Nov 24 14:58:59 CET 2006
Eduardo Ariel Menendez P wrote:
> Hello,
> I posted this yesterday, but it seems to have gone to /dev/null.
> I insist because I think it is important to benchmark with a "large" job.
> **********************************
> I would like to add that the scaling of jobs that take a few minutes of
> CPU time is totally different from that of jobs that take a few hours. For
> instance, for a small calculation with 1 or 2 atoms, linking pw.x to the
> Intel MKL library makes no difference compared with compiling BLAS/LAPACK
> from source, at least if the Intel compiler is used. However, for a job
> of 64 atoms that runs for 2 hours or more, the difference between linking
> to MKL and linking to the BLAS/LAPACK compiled from source can be a
> factor of 5.
> This is an example:
> /home/eduardo/Chemutils/Espresso/espresso-cvs/espresso/bin-ifort-serial-fftwmkl/pw.x
> <cdte.mdb.in >> cdte.scf10.out
> real 149m1.400s
> user 147m39.509s
> sys 1m21.829s
> /home/eduardo/Chemutils/Espresso/espresso-cvs/espresso/bin-ifort-serial/pw.x
> <cdte.mdb.in >> cdte.scf10.out
> real 627m25.528s
> user 626m4.072s
> sys 1m22.537s
> Using the FFTW from MKL rather than the one compiled from source makes no
> difference, at least I have not found any improvement, but the choice of
> BLAS/LAPACK is important. In summary, do benchmark with large systems,
> because the scaling of a small system is not important by itself, and it
> cannot be extrapolated (I cannot say why).
> Eduardo
>
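As a rough check on the two `time` outputs quoted above, the wall-clock ratio can be computed directly. This is only a minimal sketch in Python: the helper name `to_seconds` and the dictionary keys (taken from the two bin directory names) are illustrative choices, not something from the original post.

```python
import re

def to_seconds(stamp):
    """Convert a `time`-style string such as '149m1.400s' to seconds."""
    match = re.fullmatch(r"(\d+)m\s*([\d.]+)\s*s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised time: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# "real" values copied from the two runs quoted above,
# keyed by the bin directory each pw.x was taken from.
real = {
    "bin-ifort-serial-fftwmkl": "149m1.400s",
    "bin-ifort-serial": "627m25.528s",
}

ratio = to_seconds(real["bin-ifort-serial"]) / to_seconds(real["bin-ifort-serial-fftwmkl"])
print(f"wall-time ratio (bin-ifort-serial / bin-ifort-serial-fftwmkl): {ratio:.2f}")
```

For these two particular runs the ratio comes out at roughly 4.2.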
Hi,
I was just wondering why my experience gave exactly the opposite result
to the one posted yesterday, namely the CVS (3.2) version being faster
than 3.1.1.
My run was on 8 alpha ev7 CPUs, 1150 MHz.
I found a CPU time of 2h7m with 3.1.1 and 1h23m (1h26m wall time) with CVS.
Please note that in 3.1.1 parallel diagonalization was used, whereas in
CVS it wasn't! Does it make sense?
The run was a relaxation of an Sr-terminated SrTiO3-110 surface,
involving 19 atoms (152 electrons):
ibrav=8
celldm(1) = 7.37
celldm(2) = 1.41421356
celldm(3) = 6.717514421
...
ecutwfc=30.0
ecutrho=180.0
...
K_POINTS { automatic }
4 4 1 1 1 0
(due to symmetry the calculation reduces to 4 k-points)
I've tried to make the same test as the one reported in the forum
(mgal2o4-cf.scf.in) and got the following results (CPU time / wall time):
#CPU   3.1.1              CVS
1      9m58s / 10m12s     11m02s / 11m23s
2      5m34s / 5m46s      7m03s / 7m12s
4      2m54s / 2m60s      3m32s / 3m38s
In this case CVS is slower than 3.1.1, indeed.
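To make the comparison easier to read, the wall times in the table can be turned into parallel speedup and efficiency relative to the 1-CPU run. The following is a minimal sketch in Python, not part of either code version: the names `to_seconds`, `wall` and `scaling` are hypothetical, and the entry "2m60s" is taken literally as 180 s.

```python
import re

def to_seconds(stamp):
    """Convert a string such as '10m12s' to seconds."""
    match = re.fullmatch(r"(\d+)m\s*([\d.]+)\s*s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised time: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Wall times copied from the table above, keyed by number of CPUs.
wall = {
    "3.1.1": {1: "10m12s", 2: "5m46s", 4: "2m60s"},
    "CVS": {1: "11m23s", 2: "7m12s", 4: "3m38s"},
}

def scaling(times):
    """Return {ncpu: (speedup, efficiency)} relative to the 1-CPU run."""
    base = to_seconds(times[1])
    result = {}
    for ncpu, stamp in sorted(times.items()):
        speedup = base / to_seconds(stamp)
        result[ncpu] = (speedup, speedup / ncpu)
    return result

for version, times in wall.items():
    for ncpu, (speedup, efficiency) in scaling(times).items():
        print(f"{version:6s} {ncpu} CPU: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

Looking at speedup and efficiency separately from the absolute times helps distinguish a version that is slower on every CPU count from one that merely scales worse.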
Giovanni
--
Dr. Giovanni Cantele
Coherentia CNR-INFM and Dipartimento di Scienze Fisiche
Universita' di Napoli "Federico II"
Complesso Universitario di Monte S. Angelo - Ed. G
Via Cintia, I-80126, Napoli, Italy
Phone: +39 081 676910
Fax: +39 081 676346
E-mail: Giovanni.Cantele at na.infn.it
Web: http://people.na.infn.it/~cantele