[Pw_forum] Parallel bandstructure calculations

hqzhou hqzhou at nju.edu.cn
Tue Aug 24 04:02:37 CEST 2010


Hi,

You have 32 CPU cores divided into 4 pools, so each pool has only
8 CPU cores. 8 is not the square of any integer, which is why the
serial algorithm was used.

You need to run

mpirun -np 32 pw.x -npool 2 -ndiag 4
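The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not pw.x's actual selection logic: the helper name pool_layout is hypothetical, and the requirement that -np be divisible by -npool is an assumption of the sketch.

```python
import math


def pool_layout(nproc: int, npool: int) -> dict:
    """Split nproc MPI processes evenly into npool k-point pools."""
    # Assumption of this sketch: the process count divides evenly into pools.
    if nproc <= 0 or npool <= 0 or nproc % npool != 0:
        raise ValueError("nproc must be a positive multiple of npool")
    per_pool = nproc // npool
    # Largest integer whose square does not exceed per_pool.
    side = math.isqrt(per_pool)
    return {
        "procs_per_pool": per_pool,
        "is_perfect_square": side * side == per_pool,
        "largest_square": side * side,
    }


# The layout from the original question, then the one with 2 pools.
print(pool_layout(32, 4))
print(pool_layout(32, 2))
```

With 4 pools each pool gets 8 processes, which is not a perfect square; with 2 pools each pool gets 16, which is.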

Still, you should run tests to determine the best combination.
For my case, a 56-atom system with spin polarization, I found that
the best setup (parallel efficiency 79%) was 16 CPU cores (Xeon 5550)
per pool with the serial algorithm, 64 CPU cores in total, that is,

mpirun -np 64 pw.x -npool 4
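One way to compare such test runs is sketched below in Python. It assumes the common definition of parallel efficiency (speedup relative to a reference run, divided by the increase in core count); the message does not say how the 79% figure was obtained. The function name and all timings are placeholders, not measurements from this thread.

```python
def parallel_efficiency(ref_cores, ref_seconds, cores, seconds):
    """Efficiency relative to a reference run; ideal scaling gives 1.0."""
    if min(ref_cores, cores) <= 0 or min(ref_seconds, seconds) <= 0:
        raise ValueError("core counts and timings must be positive")
    speedup = ref_seconds / seconds
    return speedup * ref_cores / cores


# Placeholder wall times in seconds, NOT measurements from this thread.
# Keys are (total cores, npool); replace the values with your own timings.
timings = {
    (8, 1): 800.0,
    (16, 2): 420.0,
    (32, 4): 250.0,
}

ref_key = (8, 1)  # the smallest run serves as the reference
for (cores, npool), seconds in sorted(timings.items()):
    eff = parallel_efficiency(ref_key[0], timings[ref_key], cores, seconds)
    print(f"-np {cores} -npool {npool}: efficiency {eff:.2f}")
```

Timing the same short nscf run for each candidate combination and feeding the wall times into such a table makes it easier to see where adding cores stops paying off.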


huiqun zhou
@earth sciences, nanjing university, china


----- Original Message -----
From: "Nicki Frank Hinsche" <nicvok at freenet.de>
To: "pw forum" <pw_forum at pwscf.org>
Sent: Monday, August 23, 2010, 10:32:13 PM
Subject: [Pw_forum] Parallel bandstructure calculations

Hi there,

I am currently calculating iso-energy surfaces of doped
semiconductors. For this I generate a fairly large k-point mesh
with an external program, determine the eigenvalues on that mesh, and
later construct the iso-energy surface with a tetrahedron method.
My problem is the running time of the bandstructure calculation.

The unit (super)cells contain on the order of 30-50 atoms of
1 or 2 different atomic species. The eigenvalues have to be calculated
for on the order of 4000-6000 k-points (most often around
50-100 eigenvalues for each k-point).


The scf calculation finishes quite quickly. I then run the nscf
bandstructure calculation with the command


mpirun -np 32 pw.x -npool 4 -diag 16


but the calculation isn't run in parallel, as the output says:

      Parallel version (MPI), running on    32 processors
      K-points division:     npool     =    4
      R & G space division:  proc/pool =    8

      Subspace diagonalization in iterative solution of the eigenvalue problem:
      a serial algorithm will be used


Because of this, the calculation runs much longer than 72 hours, which
is too long for me and our cluster system.


So is there a way to parallelize the bandstructure calculation
efficiently and reduce the calculation time?


thanks in advance,

Nicki



-------------------------------------------------------------
Nicki Frank Hinsche, Dipl. Phys.
Institute of physics - Theoretical physics,
Martin-Luther-University Halle-Wittenberg,
Von-Seckendorff-Platz 1, Room 1.07
D-06120 Halle/Saale, Germany
Tel.: ++49 345 5525462
-------------------------------------------------------------
Fellow of the International Max Planck Re-
search School-MPI for Microstructure Physics
-------------------------------------------------------------

_______________________________________________
Pw_forum mailing list
Pw_forum at pwscf.org
http://www.democritos.it/mailman/listinfo/pw_forum

