[Pw_forum] How to using gamma point calculation with high efficiency

Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu
Sat Jun 28 18:14:41 CEST 2008


On Sat, 28 Jun 2008, vega lew wrote:

dear vega,

[...]

VL> I compiled Q-E on my cluster with version 10.1.015 of the intel compilers 
VL> successfully and correctly. Now my cluster runs very fast 
VL> when calculating the structure relaxation with 30-40 k-points. But on 
VL> my cluster, which has 5 quad-core CPUs, I must use 20 pools to get 
VL> the highest CPU usage (most of the time 90%+, but it's unstable; 70%+ 
VL> on average was shown by the 'sar' command). Thanks to Axel's advice, I 
VL> set the environment variable OMP_NUM_THREADS=1. The CPU usage on 
VL> all 5 computers was always the same. The calculation 
VL> finishes fast. If I use 10 or 5 pools, the CPU usage can't reach 
VL> that high. Is this up to snuff?

that just means you are not using your MPI library correctly.
QE assumes that consecutive MPI task ids are on the same node.
most MPI libraries decide this based on the host/machine file;
you have to read your MPI documentation to find out how to arrange
it the way it should work. you've seen from my timings that it should
work quite well. i am using OpenMPI, and that package has, on top
of using hosts in the hostfile order, the option to explicitly
follow a round-robin or a by-node scheme when assigning MPI tasks.
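
To make the placement issue concrete, here is a minimal Python sketch (not part of QE or of any MPI library) that models two ways of laying out 20 MPI tasks on 5 quad-core nodes and reports which nodes each pool ends up on. The scheme names "by-slot" and "by-node" are just labels used in this sketch, and the sketch assumes that pools are formed from equal-sized blocks of consecutive task ids, in line with the statement above that QE expects consecutive task ids on the same node.

```python
def place_ranks(n_nodes, cores_per_node, scheme):
    """Return the node index of every MPI task under a placement scheme."""
    n_ranks = n_nodes * cores_per_node
    if scheme == "by-slot":
        # Fill all cores of one node before moving on to the next node.
        return [rank // cores_per_node for rank in range(n_ranks)]
    if scheme == "by-node":
        # Deal tasks out one node at a time, cycling through the nodes.
        return [rank % n_nodes for rank in range(n_ranks)]
    raise ValueError(f"unknown scheme: {scheme}")


def nodes_per_pool(placement, n_pools):
    """List the nodes each pool touches.

    Assumption: a pool is a block of consecutive task ids, and the
    number of tasks is divisible by the number of pools.
    """
    pool_size = len(placement) // n_pools
    return [
        sorted(set(placement[p * pool_size:(p + 1) * pool_size]))
        for p in range(n_pools)
    ]


if __name__ == "__main__":
    for scheme in ("by-slot", "by-node"):
        placement = place_ranks(n_nodes=5, cores_per_node=4, scheme=scheme)
        for n_pools in (5, 10, 20):
            print(scheme, n_pools, nodes_per_pool(placement, n_pools))
```

In this model, filling nodes slot by slot keeps each of 5 pools on a single node, whereas dealing tasks out node by node spreads every pool over several nodes, so the communication inside a pool has to cross the network. With 20 pools each pool is a single task, so the placement makes no difference, which would be consistent with 20 pools being the only setting that keeps the CPUs busy.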

VL> After testing the lattice optimizations, another question arises. I 
VL> need to calculate the surface structure with gamma point only, 
VL> because the system is composed of ~80 atoms (scientists always 
VL> use gamma point optimization in my area of research). But 

whether or not a gamma point calculation is sufficient is not
determined by what other people do, but by the physics of the
problem. that reminds me, you still owe us an explanation about
the difference between "commercial" and "academic" calculations.
i am _very_ curious about this.

VL> when I calculate the surface structure with gamma point only, I 
VL> can't use many pools. Therefore the cpu usage for the gamma point 
VL> calculation comes down to about ~20% again. How can I 
VL> run the calculation with high cpu usage?

you cannot, as you should have already seen from the numbers
i sent you. g-space parallelization doesn't scale across gigabit
ethernet using TCP/IP networking. just run the job within one
node and then you can run 5 jobs at the same time.
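
As a rough illustration of the "one job per node" layout, the following Python sketch builds one hostfile and one launch command per node, without running anything. The node names, input file names, and hostfile names are hypothetical, and the command line assumes an OpenMPI-style mpirun with a "name slots=N" hostfile format and a pw.x that accepts its input via -in; check both against your own MPI and QE documentation.

```python
def single_node_jobs(nodes, inputs, cores_per_node=4):
    """Pair each input file with one node and build its launch command.

    Returns a list of (hostfile_name, hostfile_text, command) tuples.
    Nothing is written to disk or executed here.
    """
    jobs = []
    for node, input_file in zip(nodes, inputs):
        hostfile_name = f"hosts.{node}"  # hypothetical naming scheme
        # Assumed OpenMPI-style hostfile line: one node, all of its cores.
        hostfile_text = f"{node} slots={cores_per_node}\n"
        command = [
            "mpirun", "-np", str(cores_per_node),
            "--hostfile", hostfile_name,
            "pw.x", "-in", input_file,
        ]
        jobs.append((hostfile_name, hostfile_text, command))
    return jobs


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(1, 6)]      # hypothetical host names
    inputs = [f"surface{i}.in" for i in range(1, 6)]  # hypothetical inputs
    for name, text, command in single_node_jobs(nodes, inputs):
        print(name, "->", text.strip())
        print(" ".join(command))
```

Each job then uses only the 4 cores of its own node, so no g-space communication has to go over the gigabit ethernet.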

axel.

VL> 
VL> Thank you for reading.
VL> 
VL> regards, 
VL> 
VL> Vega Lew
VL> Ph.D. Candidate in Chemical Engineering
VL> State Key Laboratory of Materials-oriented Chemical Engineering
VL> College of Chemistry and Chemical Engineering
VL> Nanjing University of Technology, 210009, Nanjing, Jiangsu, China

-- 
=======================================================================
Axel Kohlmeyer   akohlmey at cmm.chem.upenn.edu   http://www.cmm.upenn.edu
   Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.

