[Pw_forum] Problems running PWscf in parallel
stewart at cnf.cornell.edu
Tue Nov 1 18:12:32 CET 2005
Hi all,
I am having problems getting PWscf to run in parallel on a Red Hat Linux
cluster using LAM/MPI. When I start the calculation, I get the standard
header information along with the number of processors that will be used.
Then the program appears to hang, even on simple calculations. I have tried
running it both in NFS-mounted directories and in local scratch directories
outside NFS, with no luck.
I have set the runs up with a hostfile for 4 nodes with 2 processors each.
I run into the same problem with both of these commands:
mpirun -np 8 pw.x < run.in > run.out
mpirun C pw.x < run.in > run.out
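For concreteness, a LAM boot schema (hostfile) for this layout would look
roughly like the sketch below. The host names node1 through node4 are
placeholders, not the actual machine names on the cluster.

```
# Hypothetical LAM/MPI boot schema: 4 nodes, 2 CPUs each.
# node1..node4 are placeholder host names.
node1 cpu=2
node2 cpu=2
node3 cpu=2
node4 cpu=2
```

With a schema like this, "mpirun C" starts one process per listed CPU, so it
should launch the same 8 processes as "mpirun -np 8".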
When the program starts, it tries to start 2 processes on each of the 4 nodes.
Shortly after this, the second process on each node quits and I am left with
4 pw.x processes, one per node, that produce no output. Memory usage of these
processes is less than 1%, so I know they aren't doing any significant work.
I would greatly appreciate any suggestions on why some of the pw.x
processes are dropping out!
Thanks,
Derek
#################################
Derek Stewart, Ph. D.
Scientific Computation Associate
250 Duffield Hall
Cornell Nanoscale Facility (CNF)
Ithaca, NY 14853
stewart (at) cnf.cornell.edu
(607) 255-2856