It seems that you provided the wrong information:<br><br>1. You showed only 8 CPUs. Where are the other 8, since you were talking about a 16-CPU job?<br>2. There is only one task actually running (PID 4404, run as root).<br>3. There&#39;s absolutely nothing else running on this node.<br>
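<br>Points 1 and 3 can be checked mechanically. Below is a minimal sketch, assuming the per-CPU lines of the quoted top output are pasted into a string; the names TOP_CPU_LINES and summarize_cpus and the 50% threshold are made up for illustration and are not part of any tool.<br>

```python
import re

# Per-CPU lines copied from the top output quoted in this thread.
TOP_CPU_LINES = """\
Cpu0 : 0.0%us, 4.8%sy, 0.0%ni, 90.3%id, 0.3%wa, 0.0%hi, 4.5%si, 0.0%st
Cpu1 : 0.0%us, 1.0%sy, 0.0%ni, 95.8%id, 3.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
"""

# Capture the CPU index and the idle percentage from each "CpuN :" line.
CPU_LINE = re.compile(r"^Cpu(\d+)\s*:.*?([\d.]+)%id", re.MULTILINE)


def summarize_cpus(top_text, busy_threshold=50.0):
    """Return (number of CPUs listed, ids of CPUs whose non-idle share exceeds the threshold)."""
    cpus = [(int(n), 100.0 - float(idle)) for n, idle in CPU_LINE.findall(top_text)]
    busy = [n for n, load in cpus if load > busy_threshold]
    return len(cpus), busy


n_cpus, busy = summarize_cpus(TOP_CPU_LINES)
print(n_cpus, busy)  # prints: 8 []
```

For the quoted output this reports 8 CPUs and none of them busy, which is not what a node running part of a 16-process job should look like.<br>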


<br>My guess is that you took this information from the wrong node, probably from the head node. <br><br>:) <br><br><div class="gmail_quote">2009/9/11 Giovanni Cantele <span dir="ltr">&lt;<a href="mailto:Giovanni.Cantele@na.infn.it">Giovanni.Cantele@na.infn.it</a>&gt;</span><br>


<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div class="h5">wangqj1 wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Dear pwscf users<br>

I use 16 CPUs to run a job, but the CPU usage is very low. It looks like this:<br>

Tasks: 179 total, 1 running, 178 sleeping, 0 stopped, 0 zombie<br>

Cpu0 : 0.0%us, 4.8%sy, 0.0%ni, 90.3%id, 0.3%wa, 0.0%hi, 4.5%si, 0.0%st<br>

Cpu1 : 0.0%us, 1.0%sy, 0.0%ni, 95.8%id, 3.3%wa, 0.0%hi, 0.0%si, 0.0%st<br>

Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<br>

Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st<br>

Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<br>

Cpu5 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<br>

Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<br>

Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st<br>

Mem: 8048812k total, 7995856k used, 52956k free, 283692k buffers<br>

Swap: 4192956k total, 124k used, 4192832k free, 7492420k cached<br>

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND<br>

4522 root 15 -5 0 0 0 S 5 0.0 17:59.16 nfsd<br>

2012 root 15 -5 0 0 0 S 1 0.0 6:29.01 kjournald<br>

4404 root 20 0 104m 17m 5044 S 0 0.2 55:07.93 X<br>

4519 root 15 -5 0 0 0 S 0 0.0 18:05.58 nfsd<br>

4521 root 15 -5 0 0 0 S 0 0.0 16:55.33 nfsd<br>

5023 gdm 20 0 235m 31m 11m S 0 0.4 13:39.99 gdm-simple-gree<br>

1 root 20 0 1064 408 348 S 0 0.0 0:02.90 init<br>

2 root 15 -5 0 0 0 S 0 0.0 0:00.02 kthreadd<br>

3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0<br>

4 root 15 -5 0 0 0 S 0 0.0 0:00.22 ksoftirqd/0<br>

<br>

The ifort, MKL, and MPI I used are:<br>

INTFC=/opt/intel/Compiler/11.0/081<br>

INTMKL=/opt/intel/mkl/10.1.1.019<br>

/opt/mpich2/bin/mpd<br>

My machine model is as follows:<br>

processor : 0<br>

vendor_id : GenuineIntel<br>

cpu family : 6<br>

model : 23<br>

model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz<br>

stepping : 10<br>

cpu MHz : 2327.489<br>

cache size : 6144 KB<br>

physical id : 0<br>

siblings : 4<br>

core id : 0<br>

cpu cores : 4<br>

apicid : 0<br>

initial apicid : 0<br>

fpu : yes<br>

fpu_exception : yes<br>

cpuid level : 13<br>

wp : yes<br>

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca sse4_1 lahf_lm<br>

bogomips : 4654.97<br>

...............<br>

The pwscf version is espresso-4.0.3 .<br>

So, I want to know why the usage is so low.<br>

How can I solve this problem? Has anybody met the same problem?<br>

Your kind help will be appreciated!<br>

Best regards<br>

Q.J.Wang<br>

XiangTan University<br>

<br>

</blockquote></div></div>

Some possibilities are:<br>

<br>

1) <a href="http://www.quantum-espresso.org/wiki/index.php/Frequently_Asked_Questions#Why_is_my_parallel_job_running_in_such_a_lousy_way.3F" target="_blank">http://www.quantum-espresso.org/wiki/index.php/Frequently_Asked_Questions#Why_is_my_parallel_job_running_in_such_a_lousy_way.3F</a><br>


2) you are running a job over a very slow network (interconnecting the different nodes)<br>

3) you are trying to force a node to write to a disk that is local to another node and reachable only<br>

through a very low-performance network<br>

4) your job requests many more resources than you have, e.g. RAM (causing a node to swap). From<br>

your data it seems you are using all of the 8 GB of memory; the same check should be done on every node.<br>
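<br>The memory check in point 4 can be sketched as follows. This is only an illustration, assuming the Mem and Swap lines of the quoted top output are pasted into a string; parse_kib and memory_summary are hypothetical helper names. In this top format the "used" figure includes buffers and page cache, so the sketch subtracts them to estimate what processes themselves hold.<br>

```python
import re

# Mem and Swap lines copied from the top output quoted in this thread.
TOP_MEM_LINES = """\
Mem: 8048812k total, 7995856k used, 52956k free, 283692k buffers
Swap: 4192956k total, 124k used, 4192832k free, 7492420k cached
"""


def parse_kib(text, label, field):
    """Return the integer in front of 'k <field>' on the line that starts with label."""
    line = next(l for l in text.splitlines() if l.startswith(label))
    return int(re.search(r"(\d+)k " + field, line).group(1))


def memory_summary(text):
    """Estimate memory held by processes (used minus buffers and cache) and swap in use, in kB."""
    used = parse_kib(text, "Mem:", "used")
    buffers = parse_kib(text, "Mem:", "buffers")
    cached = parse_kib(text, "Swap:", "cached")  # top prints 'cached' on the Swap line
    swap_used = parse_kib(text, "Swap:", "used")
    return {
        "used_by_processes_k": used - buffers - cached,
        "swap_used_k": swap_used,
    }


print(memory_summary(TOP_MEM_LINES))
# prints: {'used_by_processes_k': 219744, 'swap_used_k': 124}
```

Running the same check on every node that hosts part of the job shows whether any of them is actually short of RAM or swapping.<br>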

<br>

<br>

How are the 16 CPUs distributed? Over more than one node?<br>

<br>

Giovanni<br>

<br>

-- <br>

<br>

<br>

<br>

Dr. Giovanni Cantele<br>

Coherentia CNR-INFM and Dipartimento di Scienze Fisiche<br>

Universita&#39; di Napoli &quot;Federico II&quot;<br>

Complesso Universitario di Monte S. Angelo - Ed. 6<br>

Via Cintia, I-80126, Napoli, Italy<br>

Phone: +39 081 676910<br>

Fax:   +39 081 676346<br>

E-mail: <a href="mailto:giovanni.cantele@cnr.it" target="_blank">giovanni.cantele@cnr.it</a><br><font color="#888888">

       <a href="mailto:giovanni.cantele@na.infn.it" target="_blank">giovanni.cantele@na.infn.it</a><br>

Web: <a href="http://people.na.infn.it/%7Ecantele" target="_blank">http://people.na.infn.it/~cantele</a><br>

Research Group: <a href="http://www.nanomat.unina.it" target="_blank">http://www.nanomat.unina.it</a><br>

Skype contact: giocan74<br>

<br>

</font><br>_______________________________________________<br>

Pw_forum mailing list<br>

<a href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a><br>

<a href="http://www.democritos.it/mailman/listinfo/pw_forum" target="_blank">http://www.democritos.it/mailman/listinfo/pw_forum</a><br>

<br></blockquote></div><br><br clear="all"><br>-- <br>--------------------------------------------------<br>Duy Le<br>PhD Student<br>Department of Physics<br>University of Central Florida.<br>