[Pw_forum] Crash when running pw.x for relaxing a structure
Huiqun Zhou
hqzhou at nju.edu.cn
Wed Aug 2 08:04:09 CEST 2006
Paolo,
Thanks for your response. I have run more calculations over the past few
days. Here are some facts:
(1) The system I'm investigating has an orthorhombic unit cell. The volume
is fixed at 1570.0 bohr^3, and the calculations were carried out by varying
b/a = 0.275 - 0.400 (step 0.025)
c/a = 0.975 - 1.200 (step 0.025)
(2) Not all calculations fail at the charge density extrapolation. For
example, when b/a = 0.375, the calculation fails only when c/a = 1.200.
This error is highly reproducible.
(3) The error occurs only when running on 4 CPU cores; there is no problem
when running on one or two CPU cores. This is true for all failed cases.
(4) This problem can be reproduced on compute nodes with different dual-core
processors, such as Intel Dempsey (Xeon 5060), Woodcrest (Xeon 5140), and
AMD Opteron 280.
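
For readers who want to regenerate the scan in (1), here is a minimal
sketch, not the script actually used for these runs. It assumes that the
"start - stop : step" notation means an inclusive grid, and that the
orthorhombic cell volume is V = a*b*c, so that a = (V / ((b/a)(c/a)))^(1/3).
The function names are hypothetical. For a simple orthorhombic cell the
resulting a would presumably go into celldm(1) in the pw.x input, but the
real input is only in the attachment, so that mapping is an assumption.

```python
import itertools

VOLUME_BOHR3 = 1570.0  # fixed cell volume quoted in the post


def ratio_grid(start, stop, step):
    """Inclusive grid; built from an integer step count to avoid float drift."""
    n = round((stop - start) / step)
    return [round(start + i * step, 3) for i in range(n + 1)]


def lattice_parameters(volume, b_over_a, c_over_a):
    """Return (a, b, c) in bohr for an orthorhombic cell with V = a*b*c."""
    a = (volume / (b_over_a * c_over_a)) ** (1.0 / 3.0)
    return a, a * b_over_a, a * c_over_a


# Ranges taken from point (1); inclusive end points are an assumption.
b_ratios = ratio_grid(0.275, 0.400, 0.025)
c_ratios = ratio_grid(0.975, 1.200, 0.025)

# One entry per run: (b/a, c/a, a, b, c), lengths in bohr.
scan = [
    (ba, ca, *lattice_parameters(VOLUME_BOHR3, ba, ca))
    for ba, ca in itertools.product(b_ratios, c_ratios)
]
```

Calling lattice_parameters(VOLUME_BOHR3, 0.375, 1.200) gives the cell for
the failing case described in (2); the product a*b*c should reproduce the
fixed volume to within rounding.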
The OS of my cluster is RHEL4 U3 (kernel 2.6.9-34). I'm using Intel Fortran
9.0, MKL 8.0 and FFTW 2.1.5. I have no access to other commercial compilers,
and I failed to compile QE with g95.
The attached zip file includes the input for b/a = 0.375, c/a = 1.200, and
the output files from the runs on 1 core (successful) and 4 cores (failure).
Thanks again for your help.
Huiqun Zhou
----- Original Message -----
From: "Paolo Giannozzi" <giannozz at nest.sns.it>
To: <pw_forum at pwscf.org>
Sent: Tuesday, July 25, 2006 12:25 AM
Subject: Re: [Pw_forum] Crash when running pw.x for relaxing a structure
> On Tuesday 18 July 2006 12:21, Huiqun Zhou wrote:
>
>> I'm doing structural optimization for chromite with calcium ferrite
>> structure while changing b/a and c/a at fixed volume. But for every
>> run with a different pair of b/a and c/a, I always got the following error
>> after 3-5 rounds of SCF calculations:
>
>> Writing output data file fecr2o4-cf-relax.save
>>
>> second order charge density extrapolation
>> rank 1 in job 170 woodcrest_32906 caused collective abort of all ranks
>> exit status of rank 1: return code 220
>
>> The job was running in parallel on one compute node with 4 CPU cores
>> (Intel woodcrest).
>>
>> Did I do anything wrong?
>
> difficult to say. Is it reproducible? does it happen on other machines
> or with other compilers or in serial execution ? If it is not reproducible
> it may not be related to the code itself
>
> P.
> --
> Paolo Giannozzi Phone: +39/050-509876
> DEMOCRITOS and SNS Fax: +39/050-563513
> Piazza dei Cavalieri 7 I-56126 Pisa, Italy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fecr2o4-cf.zip
Type: application/octet-stream
Size: 25406 bytes
Desc: not available
Url : /pipermail/attachments/20060802/54688685/attachment.obj