[Pw_forum] problem with phonon in parallel

Eduardo Ariel Menendez P emenendez at macul.ciencias.uchile.cl
Fri Mar 11 21:09:45 CET 2005


I have a problem running phonon in parallel. I can run pw.x with
no apparent problem, but ph.x fails.
For example, I ran example 06 step by step:

mdo_mpi_fast /home/gonzalo/eariel/Compilaciones/Espresso/bin/pw.x < alas.scf.in >alas.scf.out

mdo_mpi_fast /home/gonzalo/eariel/Compilaciones/Espresso/bin/ph.x < alas.phG.in  > alas.phG.out

and this is what the output of ph.x looks like:

MPI_Recv: message truncated (rank 1, comm 4) %Really early problem

     Program PHONON    v.2.1.2  starts ...
     Today is 11Mar2005 at 13:35: 9

     Parallel version (MPI)

     Number of processors in use:       2
     R & G space division:  nprocp =    2

     Ultrasoft (Vanderbilt) Pseudopotentials

     Reading file alas.save ...
     read complete

     Reading file alas.save ...
     read complete

     Planes per process (thick) : nr3 = 20 npp =  10 ncplane =  400


     Atomic displacements:
     There are   2 irreducible representations

     Representation     1      3 modes - To be done

     Representation     2      3 modes - To be done
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD):  - MPI_Recv()
Rank (1, MPI_COMM_WORLD):  - MPI_Bcast()
Rank (1, MPI_COMM_WORLD):  - MPI_Allreduce()
Rank (1, MPI_COMM_WORLD):  - main()

and I receive this error message:

One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 27484 failed on node n0 ( with exit status 1.
mpirun failed with exit status 1

Well, this MPI problem is not always so evident.
I also ran with the option -npool 2 (there are two k-points in this example).

In this case the run begins OK:
     Program PHONON    v.2.1.2  starts ...
     Today is 11Mar2005 at 14:12:47

     Parallel version (MPI)

     Number of processors in use:       2
     K-points division:     npool  =    2

but it stops at the end:
     Electric Fields Calculation

      iter #   1 total cpu time :     0.6 secs   av.it.:   6.3
      thresh= 0.100E-01 alpha_mix =  0.700 |ddv_scf|^2 =  0.585E-05

      iter #   2 total cpu time :     0.9 secs   av.it.:   9.3
      thresh= 0.242E-03 alpha_mix =  0.700 |ddv_scf|^2 =  0.742E-06

      iter #   3 total cpu time :     1.2 secs   av.it.:   9.3
      thresh= 0.861E-04 alpha_mix =  0.700 |ddv_scf|^2 =  0.188E-08

      iter #   4 total cpu time :     1.5 secs   av.it.:  10.0
      thresh= 0.433E-05 alpha_mix =  0.700 |ddv_scf|^2 =  0.120E-10

      iter #   5 total cpu time :     1.8 secs   av.it.:   9.3
      thresh= 0.347E-06 alpha_mix =  0.700 |ddv_scf|^2 =  0.573E-12

     End of electric fields calculation

          Dielectric constant in cartesian axis

          (      27.811916498       0.
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 27533 failed on node n1 ( due to signal 11.
mpirun failed with exit status 11

I would appreciate any suggestion to locate the cause.

Best regards

Eduardo A. Menendez Proupin
Department of Physics
Faculty of Science
University of Chile
Las Palmeras 3425
Ñuñoa, Santiago
Phone: 56+2+678 74 11
