[Pw_forum] How to tune cputimes using mpich2-1.0.8
Mahmoud Payami Shabestari
mpayami at aeoi.org.ir
Sat Feb 21 20:18:22 CET 2009
>
> how are the timings if you don't use all 8 cores?
> do the jobs get faster again?
>
N_core   total CPU time for one iteration (s)
------   ------------------------------------
   1     398.89
   2     200.14
   3     134.61
   4     101.46
   5      85.24
   6      71.60
   7      63.31
   8      57.16
Is it surprising?
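
For reference, the speedup and parallel efficiency implied by the table can be worked out as below. This is only a minimal Python sketch, not something from the original exchange: the names `timings` and `scaling` are illustrative, and the inputs are the reported CPU times, so the result says nothing about wall-clock behaviour. Relative to one core, 8 cores give a speedup of roughly 7.0, i.e. about 87% efficiency.

```python
# Timings copied from the table above:
# number of cores -> total CPU time in seconds for one iteration.
timings = {
    1: 398.89, 2: 200.14, 3: 134.61, 4: 101.46,
    5: 85.24, 6: 71.60, 7: 63.31, 8: 57.16,
}


def scaling(timings):
    """Return {n: (speedup, efficiency)} relative to the single-core run."""
    t1 = timings[1]
    # speedup = T(1) / T(n); efficiency = speedup / n
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(timings.items())}


if __name__ == "__main__":
    for n, (speedup, efficiency) in scaling(timings).items():
        print(f"{n:2d} cores: speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```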
> if you are still running the same "benchmark" that
> you were running before, your comparisons are most
> likely severely flawed.
No, Axel! The earlier benchmark was a two-atom Au cluster with "relax"
calculations on a dual-core Pentium 4, 3.2 GHz. The current one is a slab.
> you never could prove to me,
> that you are running a correctly compiled executable
> and mpi installation. so you may be comparing apples
> and oranges. your openmpi timings are highly suspicious.
>
I do not want to prove anything. I am just reporting my experience.
Anyone interested can verify it for themselves.
> i was showing you, that openmpi _does_ behave properly
> on an example that does specifically test MPI performance
> and not depend on anything else (like NFS i/o).
I agree with you. In that case you were comparing apples with oranges!
>
> MP> > if you see these kinds of differences, then there is something
> MP> > else causing problems.
> MP> >
> MP> > are you using processor and memory affinity with openmpi?
> MP>
> MP> I am not familiar with these concepts. I just apply (as a good(?)
> MP> student) what you taught me during the hpc08.
>
> processor affinity is tying a process to a specific CPU.
> in multi-processor/multi-core environments, this has
> severe performance implications, as it improves cpu
> cache utilization. just stick those keywords into google
> and you'll see.
>
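
To make the idea of processor affinity concrete, here is a minimal sketch, assuming Linux and Python 3. It pins the calling process to a single logical CPU with `os.sched_setaffinity`. The helper name `pin_to_cpu` is hypothetical, and this is not how Open MPI or MPICH2 apply affinity internally; MPI launchers normally offer their own binding options, so check your implementation's documentation.

```python
import os


def pin_to_cpu(cpu_index):
    """Pin the calling process to one logical CPU and return the new CPU set.

    os.sched_setaffinity is Linux-only, so None is returned on platforms
    where it is missing.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = sorted(os.sched_getaffinity(0))   # CPUs this process may use now
    target = allowed[cpu_index % len(allowed)]  # stay inside the allowed set
    os.sched_setaffinity(0, {target})           # pid 0 means "this process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("pinned to:", pin_to_cpu(0))
```

In an MPI job, each rank would pick a different CPU (for example from its local rank), so that processes do not migrate between cores and lose their cache contents.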
> MP> >
> MP> > what kind of processor is this exactly?
> MP> It is 5420.
>
> ok. so that is intel quad-core. i have a bunch of 5430s
> available to me. please redo those tests with the 32-water
> cp.x input from example21 of the Q-E distribution. and
> then we can start discussing seriously. for as long as
> nobody can reproduce your benchmarks, they are useless.
I do not have any experience with CPMD.
>
> also you still have a huge difference between wall
> clock and cpu clock. in short, you are trying to solve
> the least important problem first.
>
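
The gap between wall-clock time and CPU time mentioned here can be illustrated with a small sketch, assuming Python 3. The helper `timed` is hypothetical, and `time.sleep` merely stands in for a process that waits (on I/O, the network, or other ranks) without using the CPU. It is not a measurement of pw.x itself.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time used by this process only
    result = fn(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, wall, cpu


if __name__ == "__main__":
    # Sleeping consumes wall-clock time but almost no CPU time.
    _, wall, cpu = timed(time.sleep, 0.2)
    print(f"wall {wall:.2f} s, cpu {cpu:.2f} s")
```

When wall time is much larger than CPU time, the process spends most of its life waiting, and tuning the compute part alone will not shorten the run by much.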
> i'd kindly ask you not to make claims about mpi implementations
> being "better" unless you can prove that the differences in
> timings are really due to the mpi implementation and not due
> to improper use of the machine or inadequate hardware.
I just reported my findings and tried to share them.
regards, mahmoud
>
> cheers,
> axel.
> --
> =======================================================================
> Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu http://www.cmm.upenn.edu
> Center for Molecular Modeling -- University of Pennsylvania
> Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
> tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
> =======================================================================
> If you make something idiot-proof, the universe creates a better idiot.