[Pw_forum] error in the q2r.out

陳 少華 chen_shao_hua197 at yahoo.com.tw
Thu Sep 27 18:02:21 CEST 2007


Dear Axel and Jiayu,

First, the code is the parallel version, and the PC
cluster has 2 CPUs per node.
The previous error occurred when running on 4 nodes (8 CPUs).
I read about Jiayu's experience with this kind of run, so I
repeated the same run on 1 node (2 CPUs), although
I don't know why Jiayu's approach works.
The q2r.out looks fine this time.

But matdyn.out.dos showed this error:

p1_374: (22.726562) net_recv failed for fd = 3
p1_374:  p4_error: net_recv read, errno = : 104
bm_list_373:  p4_error: listener select: -1
rm_l_1_375: (22.726562) net_send: could not write to fd=5, errno = 32
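
As a side note on reading messages like these: the errno numbers can be looked up by hand. The sketch below is only an illustration, assuming the cluster runs Linux, where errno 32 is EPIPE and errno 104 is ECONNRESET (the numbers differ on other systems). The table, the regular expression, and the function name decode_errnos are hypothetical helpers, not part of Quantum ESPRESSO or of the MPI library. Both codes only say that the process at the other end of a socket went away, which fits Axel's reading that the other MPI processes died; they do not say why.

```python
import re

# Linux errno values (assumed platform; the numbers differ on other systems).
LINUX_ERRNO = {
    32: ("EPIPE", "Broken pipe"),
    104: ("ECONNRESET", "Connection reset by peer"),
}

# Matches "errno = 32" and also the "errno = : 104" form seen in the log.
ERRNO_RE = re.compile(r"errno\s*=\s*:?\s*(\d+)")


def decode_errnos(log_text):
    """Return (number, name, description) for each errno found in log_text."""
    found = []
    for match in ERRNO_RE.finditer(log_text):
        number = int(match.group(1))
        name, description = LINUX_ERRNO.get(number, ("UNKNOWN", "not in table"))
        found.append((number, name, description))
    return found


log = """\
p1_374:  p4_error: net_recv read, errno = : 104
rm_l_1_375: (22.726562) net_send: could not write to fd=5, errno = 32
"""

for number, name, description in decode_errnos(log):
    print(f"errno {number}: {name} ({description})")
```

The more specific cause is usually in the output of whichever process failed first, which here would be the PGFIO message in error.out.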

And error.out showed:
 
PGFIO-F-223/formatted write/internal file/illegal P, T or B edit descriptor - value missing.
 In source file matdyn.F90, at line number 1667
    p4_error: latest msg from perror: Bad file descriptor
mpiexec: Warning: tasks 0-1 exited with status 1.

For matdyn.out.freq, it showed:

p1_381: (1.125000) net_recv failed for fd = 3
bm_list_380:  p4_error: listener select: -1
p1_381:  p4_error: net_recv read, errno = : 104
rm_l_1_382: (1.125000) net_send: could not write to fd=5, errno = 32

And error.out showed:

PGFIO-F-235/formatted write/unit=20/edit descriptor does not match item type.
 File name = gam.lines    formatted, sequential access
  record = 6
 In source file matdyn.F90, at line number 1797
    p4_error: latest msg from perror: Bad file descriptor
mpiexec: Warning: tasks 0-1 exited with status 1.

What do these errors mean?

Best Regards
max
Physics Department,National Taiwan University,Taiwan

--- Jiayu Dai <daijiayu at nudt.edu.cn> wrote:

> 
> >CSH> Dear forum's members,
> >CSH> 
> >
> >[...]
> >
> >CSH> p5_16906: (12.808594) net_send: could not write to fd=5, errno = 32
> >CSH> p7_4306: (14.812500) net_send: could not write to fd=5, errno = 32
> >CSH> p3_16405: (14.820312) net_send: could not write to fd=5, errno = 32
> >CSH> 
> >CSH> Does anyone know what it means ?
> >
> >that means, that all your MPI "slave" processes
> died.
> >did you check whether your input runs in serial?
> >
> >cheers,
> >  axel.
> 
> I have the same problem when I run mpirun -np 4 ph.x
> -npool 2 < *.ph.in. But if I run mpirun -np 4 ph.x
> -npool 1 < *.ph.in or mpirun -np 4 ph.x < *.ph.in,
> the error disappears. Does this indicate a memory
> error? What did you mean by "in serial"?
> 
> Thanks,
> Jiayu
> 
> 
> 
> 
> ------------------------------
> Jiayu Dai
> National University of Defense Technology, P R China
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://www.democritos.it/mailman/listinfo/pw_forum
> 




