[Pw_forum] The low usage of CPUs
Duy Le
ttduyle at gmail.com
Fri Sep 11 16:38:11 CEST 2009
It seems that the information you provided is not right:
1. You showed only 8 CPUs. Where are the other 8, given that you were talking
about a 16-CPU job?
2. There is only one task actually running (PID 4404, run as root).
3. There is absolutely nothing else running on this node.
My guess is that you took this information from the wrong node, probably from
the head node.
:)
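
To make the check above concrete, here is a minimal sketch in Python that reads the per-CPU lines of the top output quoted below and reports how idle each core is. The function names (idle_by_cpu, mean_idle, busy_cpus) and the 50% threshold are only illustrative assumptions, not part of any Quantum ESPRESSO or MPI tool.

```python
import re

# Per-CPU lines copied from the top output quoted in this thread.
TOP_SAMPLE = """\
Cpu0 : 0.0%us, 4.8%sy, 0.0%ni, 90.3%id, 0.3%wa, 0.0%hi, 4.5%si, 0.0%st
Cpu1 : 0.0%us, 1.0%sy, 0.0%ni, 95.8%id, 3.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
"""

# Matches "CpuN : ... XX.X%id" and captures the CPU index and the idle percentage.
CPU_LINE = re.compile(r"^Cpu(\d+)\s*:.*?([\d.]+)%id", re.MULTILINE)


def idle_by_cpu(top_text):
    """Return {cpu_index: idle_percent} for every per-CPU line found."""
    return {int(n): float(idle) for n, idle in CPU_LINE.findall(top_text)}


def mean_idle(idle):
    """Average idle percentage over all CPUs that were reported."""
    return sum(idle.values()) / len(idle)


def busy_cpus(idle, threshold=50.0):
    """CPUs whose idle percentage is below the (assumed) threshold."""
    return sorted(cpu for cpu, pct in idle.items() if pct < threshold)


if __name__ == "__main__":
    idle = idle_by_cpu(TOP_SAMPLE)
    print("CPUs reported:", len(idle))
    print("mean idle: %.1f%%" % mean_idle(idle))
    print("busy CPUs:", busy_cpus(idle))
```

On the quoted data this finds 8 CPUs and no busy core, which is what you would expect from a node that is not running the job. Pasting the per-CPU lines from top on each compute node into the same script would show where the 16 processes actually are.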
2009/9/11 Giovanni Cantele <Giovanni.Cantele at na.infn.it>
> wangqj1 wrote:
>
>> Dear pwscf users
>> I use 16 CPUs to run a job, but the CPU usage is very low. It looks like this:
>> Tasks: 179 total, 1 running, 178 sleeping, 0 stopped, 0 zombie
>> Cpu0 : 0.0%us, 4.8%sy, 0.0%ni, 90.3%id, 0.3%wa, 0.0%hi, 4.5%si, 0.0%st
>> Cpu1 : 0.0%us, 1.0%sy, 0.0%ni, 95.8%id, 3.3%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu5 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
>> Mem: 8048812k total, 7995856k used, 52956k free, 283692k buffers
>> Swap: 4192956k total, 124k used, 4192832k free, 7492420k cached
>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> 4522 root 15 -5 0 0 0 S 5 0.0 17:59.16 nfsd
>> 2012 root 15 -5 0 0 0 S 1 0.0 6:29.01 kjournald
>> 4404 root 20 0 104m 17m 5044 S 0 0.2 55:07.93 X
>> 4519 root 15 -5 0 0 0 S 0 0.0 18:05.58 nfsd
>> 4521 root 15 -5 0 0 0 S 0 0.0 16:55.33 nfsd
>> 5023 gdm 20 0 235m 31m 11m S 0 0.4 13:39.99 gdm-simple-gree
>> 1 root 20 0 1064 408 348 S 0 0.0 0:02.90 init
>> 2 root 15 -5 0 0 0 S 0 0.0 0:00.02 kthreadd
>> 3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0
>> 4 root 15 -5 0 0 0 S 0 0.0 0:00.22 ksoftirqd/0
>>
>> The ifort, MKL, and MPI I used are:
>> INTFC=/opt/intel/Compiler/11.0/081
>> INTMKL=/opt/intel/mkl/10.1.1.019
>> /opt/mpich2/bin/mpd
>> My machine model is as follows:
>> processor : 0
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 23
>> model name : Intel(R) Xeon(R) CPU E5410 @ 2.33GHz
>> stepping : 10
>> cpu MHz : 2327.489
>> cache size : 6144 KB
>> physical id : 0
>> siblings : 4
>> core id : 0
>> cpu cores : 4
>> apicid : 0
>> initial apicid : 0
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 13
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
>> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
>> constant_tsc arch_perfmon pebs bts rep_good nopl pni monitor ds_cpl vmx est tm2
>> ssse3 cx16 xtpr dca sse4_1 lahf_lm
>> bogomips : 4654.97
>> ...............
>> The pwscf version is espresso-4.0.3.
>> So I want to know why the usage is so low.
>> How can I solve this problem? Has anybody met the same problem?
>> Your kind help will be appreciated!
>> Best regards
>> Q.J.Wang
>> XiangTan University
>>
> Some possibilities are:
>
> 1)
> http://www.quantum-espresso.org/wiki/index.php/Frequently_Asked_Questions#Why_is_my_parallel_job_running_in_such_a_lousy_way.3F
> 2) you are running a job over a very slow network (interconnecting the
> different nodes)
> 3) you are trying to force a node to write to a disk that is local to
> another node and connected to it
> through a very low-performance network
> 4) your job requests many more resources than you have, e.g. RAM
> (causing a node to swap). From
> your data it seems you are using all of the 8 GB of memory; actually, the same test
> should be done on every node.
>
>
> How are the 16 CPUs distributed? Over more than one node?
>
> Giovanni
>
> --
>
>
>
> Dr. Giovanni Cantele
> Coherentia CNR-INFM and Dipartimento di Scienze Fisiche
> Universita' di Napoli "Federico II"
> Complesso Universitario di Monte S. Angelo - Ed. 6
> Via Cintia, I-80126, Napoli, Italy
> Phone: +39 081 676910
> Fax: +39 081 676346
> E-mail: giovanni.cantele at cnr.it
> giovanni.cantele at na.infn.it
> Web: http://people.na.infn.it/~cantele
> Research Group: http://www.nanomat.unina.it
> Skype contact: giocan74
>
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://www.democritos.it/mailman/listinfo/pw_forum
>
>
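
Regarding point 4 in Giovanni's list, the Mem and Swap lines of the quoted top output can be checked with a little arithmetic. The sketch below assumes the older top layout, in which "used" includes buffers and the page cache; the function name used_by_processes_kb is just for illustration.

```python
# Values in kB, copied from the Mem/Swap lines of the quoted top output.
MEM_TOTAL_KB = 8048812
MEM_USED_KB = 7995856
BUFFERS_KB = 283692
CACHED_KB = 7492420
SWAP_USED_KB = 124


def used_by_processes_kb(used, buffers, cached):
    """Memory held by processes, assuming 'used' includes buffers and cache."""
    return used - buffers - cached


if __name__ == "__main__":
    app_kb = used_by_processes_kb(MEM_USED_KB, BUFFERS_KB, CACHED_KB)
    print("held by processes: %d kB (%.1f%% of total)"
          % (app_kb, 100.0 * app_kb / MEM_TOTAL_KB))
    print("swap in use: %d kB" % SWAP_USED_KB)
```

Under that assumption, most of the "used" memory on this node is file cache that the kernel can reclaim, and almost no swap is in use, so this particular node does not look short of RAM. It says nothing about the compute nodes, where the same check would still be needed.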
--
--------------------------------------------------
Duy Le
PhD Student
Department of Physics
University of Central Florida.