[Pw_forum] -nimage -npool -ntg -ndiag

Paolo Giannozzi giannozz at democritos.it
Sun Apr 26 18:36:34 CEST 2009


Hi Eduardo

> mpirun -np 4096 ./pw.x -nimage 8 -npool 2 -ntg 8 -ndiag 144 -input myinput.in

Presently, -nimage is relevant only for NEB calculations, and -npool
only in the presence of k-points. In the following I am assuming
nimage=npool=1.

> I would like to see some hints, in addition to what is reproduced
> below (from the users guide), about the good choices of -ntg and
> -ndiag
>

-ntg is useful if the number of processors you want to use is
comparable to or larger than nr3s (the FFT dimension along axis 3).
In that case it may pay off to perform FFTs in parallel on ntg groups
of electronic states, parallelizing each FFT on np/ntg processors.
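
To make the arithmetic concrete, here is a minimal sketch in Python. It is
not taken from pw.x: the function name, the requirement that ntg divides np
exactly, the plain "np >= nr3s" test standing in for "comparable to or
larger than", and the numbers in the example are all assumptions made for
illustration.

```python
def fft_task_group_layout(np_procs, ntg, nr3s):
    """Return (processors per FFT, whether task groups look worth trying).

    Hypothetical helper, not part of pw.x. It assumes ntg divides np_procs
    exactly and uses 'np_procs >= nr3s' as a crude stand-in for
    'np is comparable to or larger than nr3s'.
    """
    if ntg < 1 or np_procs % ntg != 0:
        raise ValueError("this sketch assumes ntg is a positive divisor of np")
    procs_per_fft = np_procs // ntg   # each FFT is parallelized on np/ntg procs
    worth_trying = np_procs >= nr3s   # crude reading of the rule of thumb
    return procs_per_fft, worth_trying


# Illustrative numbers only: 256 processors, 8 task groups, nr3s = 120.
print(fft_task_group_layout(256, 8, 120))
```

With these made-up numbers each FFT runs on 32 processors, which is well
below nr3s even though the total processor count is above it.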

-ndiag should be such that ndiag^2 <= np. The default is the largest
possible ndiag; the ideal choice is the one that gives the best
performance, and it depends upon many factors (dimension of the
matrices to be manipulated, communication hardware, phase of the
moon...). The larger ndiag, the smaller the memory usage (all matrices
that are manipulated are distributed across ndiag^2 processors) and
the smaller the number of floating-point operations per processor,
but eventually the overall performance will flatten and then degrade
due to communication overhead. Don't even think of trying this on slow
communication hardware.
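
As a rough illustration of the constraint and of the memory argument, here
is a small Python sketch. It follows the rule exactly as stated in this
message (ndiag^2 <= np) and is not a description of how pw.x interprets the
flag internally. The function names, the assumption of a perfectly even
distribution, and the matrix dimension in the example are mine.

```python
import math


def largest_ndiag(np_procs):
    """Largest integer ndiag with ndiag**2 <= np_procs (hypothetical helper)."""
    return math.isqrt(np_procs)


def matrix_elements_per_proc(n, ndiag):
    """Elements of an n x n matrix held by each of ndiag**2 processors.

    Assumes a perfectly even distribution, rounded up; real block layouts
    may differ.
    """
    return -(-(n * n) // (ndiag * ndiag))  # ceiling division


# Illustrative numbers only: 256 processors, a 4096 x 4096 matrix.
np_procs, n = 256, 4096
for ndiag in (1, 8, largest_ndiag(np_procs)):
    print(ndiag, matrix_elements_per_proc(n, ndiag))
```

This only shows how memory per processor shrinks as ndiag grows. It says
nothing about where communication overhead starts to dominate, which has to
be found by timing runs on the actual machine.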

Some time ago I was asked to write the proceedings for some Italian
conference on high-performance computing. As a rule I don't write
anything that doesn't contain something new or at least something
useful. The result of my and Carlo Cavazzoni's effort to reduce the
waste of bits is the following small paper:
    http://www.fisica.uniud.it/~giannozz/Papers/rimini08.pdf
containing some information on parallelization levels in
quantum-espresso. Whether it is useful or not, the reader will judge.

Paolo

