Hi, <br>I have found a problem with a benchmark calculation with pw.x. It aborts when running with the -npool option.<br><br>However, it runs normally without the -npool option. The calculation uses 34 k-points. <br>
<br><br>I have tried with <br>mpirun -np 32 pw.x -npool 2<br>mpirun -np 32 pw.x -npool 4<br>mpirun -np 4 pw.x -npool 2<br><br>on two machines, using OpenMPI and HPMPI, on single and multiple nodes. It fails every time.<br>
<br>Looking at the output I see the following messages<br>
<br>%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>     from addusdens_r : error #         1<br>     from addusdens_r : error #         1<br>

     expected  360.00000000, found  101.25021916: wrong charge, increase ecutrho<br>     from addusdens_r : error #         1<br>     expected  360.00000000, found  101.25021916: wrong charge, increase ecutrho<br>     expected  360.00000000, found  101.25021916: wrong charge, increase ecutrho<br>

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br><br> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>

     stopping ...<br><br><br>     stopping ...<br>     stopping ...<br> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>     from addusdens_r : error #         1<br>     expected  360.00000000, found  101.25021916: wrong charge, increase ecutrho<br>

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br><br>     stopping ...<br><br>The previous output was with -npool 4. Using -npool 2 I get output like this:<br><br> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br>     from addusdens_r : error #         1<br>     expected  360.00000000, found  180.00040683: wrong charge, increase ecutrho<br>     from addusdens_r : error #         1<br>
     expected  360.00000000, found  180.00040683: wrong charge, increase ecutrho<br><br>It looks like the code has a problem summing the charge densities of the pools.<br><br>Here is the input (the pseudos are from the web site):<br>
<br> &amp;CONTROL<br>                 calculation = &#39;scf&#39; ,<br>
                restart_mode = &#39;from_scratch&#39; ,<br>                       outdir = &#39;.&#39; ,<br>                   pseudo_dir = &#39;.&#39; <br>                      prefix = &#39;cdsebench&#39; ,<br>                      wfcdir = &#39;/tmp&#39;,<br>

 /<br> &amp;SYSTEM<br>                       ibrav = 0,<br>                   celldm(1) = 1.8897261,<br>                         nat = 40,<br>                        ntyp = 2,<br>                     ecutwfc = 30.0 ,<br>
                     ecutrho = 180.0 ,<br>
                 occupations = &#39;smearing&#39; ,<br>                     degauss = 0.02 ,<br>                    smearing = &#39;gaussian&#39; ,<br>!                   qcutz=150., q2sigma=2.0, ecfixed=24.0<br> /<br> &amp;ELECTRONS<br>

            electron_maxstep = 60,<br>                    conv_thr = 1.0D-6 ,<br>                 startingpot = &#39;atomic&#39; ,<br>                 startingwfc = &#39;random&#39; ,<br>!                 mixing_mode = &#39;TF&#39; ,<br>

                 mixing_beta = 0.7D0,<br>                 mixing_beta = 0.7D0,<br>              diagonalization = &#39;david&#39; ,<br>                        tqr = .true.<br> /<br><br>CELL_PARAMETERS hexagonal<br>     4.373836756  0.000000000  0.000000000<br>

    -2.187115631  3.787739854  0.000000000<br>     0.000000000  0.000000000 71.411016248<br><br>ATOMIC_SPECIES<br>   Cd  112.41000  Cd.pbe-van.UPF<br>   Se  78.960000  Se.pbe-van.UPF<br><br>ATOMIC_POSITIONS (crystal)<br>
Cd    0.000000000    0.000000000    0.000000000<br>
Cd    0.000000000    0.000000000    0.100000000<br>Cd    0.000000000    0.000000000    0.200000000<br>Cd    0.000000000    0.000000000    0.300000000<br>Cd    0.000000000    0.000000000    0.400000000<br>Cd    0.000000000    0.000000000    0.500000000<br>

Cd    0.000000000    0.000000000    0.600000000<br>Cd    0.000000000    0.000000000    0.700000000<br>Cd    0.000000000    0.000000000    0.800000000<br>Cd    0.000000000    0.000000000    0.900000000<br>Cd    0.666663821    0.333312908    0.050000151<br>

Cd    0.666663821    0.333312908    0.150000151<br>Cd    0.666663821    0.333312908    0.250000151<br>Cd    0.666663821    0.333312908    0.350000151<br>Cd    0.666663821    0.333312908    0.450000151<br>Cd    0.666663821    0.333312908    0.550000151<br>

Cd    0.666663821    0.333312908    0.650000151<br>Cd    0.666663821    0.333312908    0.750000151<br>Cd    0.666663821    0.333312908    0.850000151<br>Cd    0.666663821    0.333312908    0.950000151<br>Se    0.000000000    0.000000000    0.037565413<br>

Se    0.000000000    0.000000000    0.137565413<br>Se    0.000000000    0.000000000    0.237565413<br>Se    0.000000000    0.000000000    0.337565413<br>Se    0.000000000    0.000000000    0.437565413<br>Se    0.000000000    0.000000000    0.537565413<br>

Se    0.000000000    0.000000000    0.637565413<br>Se    0.000000000    0.000000000    0.737565413<br>Se    0.000000000    0.000000000    0.837565413<br>Se    0.000000000    0.000000000    0.937565413<br>Se    0.666665290    0.333319516    0.087565394<br>

Se    0.666665290    0.333319516    0.187565394<br>Se    0.666665290    0.333319516    0.287565394<br>Se    0.666665290    0.333319516    0.387565394<br>Se    0.666665290    0.333319516    0.487565394<br>Se    0.666665290    0.333319516    0.587565394<br>

Se    0.666665290    0.333319516    0.687565394<br>Se    0.666665290    0.333319516    0.787565394<br>Se    0.666665290    0.333319516    0.887565394<br>Se    0.666665290    0.333319516    0.987565394<br>K_POINTS automatic<br>

8 8 1 0 0 0<br><br><br>Testing the speed, version 4.1 is a bit slower than 4.0.4 (about 9% more time in this benchmark: 39.5 vs 36 minutes, using 32 CPUs). I guess this is irrelevant in the face of Moore&#39;s law.<br><br><br>
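As a quick check of the charge values in the error messages above, the short Python sketch below compares the charge found by addusdens_r with the expected 360 electrons for each -npool setting. The numbers are copied from the output shown earlier; the helper name charge_fraction is only illustrative and is not part of pw.x. The fraction comes out at about 0.5 for -npool 2 and about 0.28 for -npool 4, which would be consistent with only part of the k-point sum being counted, but this is an interpretation and not a confirmed diagnosis.

```python
# Charge values copied from the pw.x error messages in this post.
EXPECTED_CHARGE = 360.0
found_charge = {2: 180.00040683, 4: 101.25021916}  # npool -> charge reported


def charge_fraction(found, expected=EXPECTED_CHARGE):
    """Return the fraction of the expected charge that was actually found.

    Illustrative helper only; it is not a pw.x routine.
    """
    return found / expected


for npool, found in sorted(found_charge.items()):
    frac = charge_fraction(found)
    # Print 1/npool next to the observed fraction for comparison.
    print(f"npool={npool}: found/expected = {frac:.5f}, 1/npool = {1 / npool:.5f}")
```

Note that the -npool 4 fraction is not exactly 1/4, so the missing charge does not simply scale as 1/npool in this run.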
<br>Thank you,<br>Best regards<br clear="all"><br>-- <br>Eduardo Menendez<br>Departamento de Fisica<br>Facultad de Ciencias<br>Universidad de Chile<br>Phone: (56)(2)9787439<br>URL: <a href="http://fisica.ciencias.uchile.cl/%7Eemenendez" target="_blank">http://fisica.ciencias.uchile.cl/~emenendez</a><br>