[Pw_forum] GIPAW: error in output
Lorenzo Paulatto
Lorenzo.Paulatto at impmc.upmc.fr
Tue Oct 25 18:57:41 CEST 2011
Recent versions of gfortran have a -fbacktrace option that (together with
-g, I think) would give you the (almost) exact source line that causes the
problem; without that you should still be able to run
addr2line -e /wherever/bin/gipaw.x 0x44e576
to get the line of cgsolve_all that triggers the NaN or inf. However, I do
not know the gipaw code; maybe Davide can guess something from that.
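
With several MPI ranks dumping interleaved traces, it may be easier to
collect the frame addresses first and hand them to addr2line in one go.
Below is a minimal sketch in Python, assuming the log has the Open MPI
layout quoted further down; the helper names frames() and
addr2line_command() are made up for illustration and are not part of
Quantum ESPRESSO or binutils. Note that addr2line only reports file and
line if the binary was built with -g.

```python
import re

# One backtrace frame, e.g.
#   /path/gipaw.x(cgsolve_all_+0xff6)
#   [0x44e576]
# The address may sit on the same line or on the next one.
FRAME = re.compile(r"(\S+)\((\w+)\+0x[0-9a-fA-F]+\)\s*\[(0x[0-9a-fA-F]+)\]")


def frames(log, binary_suffix="gipaw.x"):
    """Return unique (binary, function, address) tuples in order of appearance.

    Only frames whose binary path ends with binary_suffix are kept, so
    frames from libc or libpthread are skipped.
    """
    # Strip mail quote markers ("> ") so that wrapped frames join up again.
    text = "\n".join(line.lstrip("> ") for line in log.splitlines())
    seen, out = set(), []
    for path, func, addr in FRAME.findall(text):
        if path.endswith(binary_suffix) and addr not in seen:
            seen.add(addr)
            out.append((path, func, addr))
    return out


def addr2line_command(binary, addresses):
    """Build the argument list for one addr2line call (-f also prints function names)."""
    return ["addr2line", "-f", "-e", binary] + list(addresses)
```

The returned list can be passed to subprocess.run(), or joined with
spaces and pasted into a shell, on the machine where gipaw.x was built.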
cheers
On Tue, 25 Oct 2011 18:54:30 +0200, Carlo Nervi <carlo.nervi at unito.it>
wrote:
> On 25/10/2011 18:22, Lorenzo Paulatto wrote:
>> You could try to compile using this option (all together, without
>> spaces):
>> -ffpe-trap=invalid,zero,overflow,underflow,denormal
>> to force the code to crash at the first appearance of a NaN; this could
>> help track down the source of the problem.
>
> Thank you Lorenzo. I did what you suggested and I got many errors like
> this:
>
> [chpc111:47070] Signal: Floating point exception (8)
> [chpc111:47070] Signal code: Floating point divide-by-zero (3)
> [chpc111:47070] Failing at address: 0x44e576
> [chpc111:47092] [ 0] /lib64/libpthread.so.0(+0x10eb0) [0x2b972490deb0]
> [chpc111:47092] [ 1]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(cgsolve_all_+0xff6)
> [0x44e576]
> [chpc111:47092] [ 2]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(greenfunction_+0x1280)
> [0x44d1a0]
> [chpc111:47092] [ 3]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(paramagnetic_correction_aug_+0x1e30)
> [0x4718f0]
> [chpc111:47092] [ 4]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(suscept_crystal_+0x520f)
> [0x45c02f]
> [chpc111:47092] [ 5]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(main+0x1a7)
> [0x43d227]
> [chpc111:47092] [ 6] /lib64/libc.so.6(__libc_start_main+0xed)
> [0x2b9724b3db0d]
> [chpc111:47092] [ 7]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x() [0x43d2fd]
> [chpc111:47092] *** End of error message ***
> [chpc111:47105] *** Process received signal ***
>
> and
>
> [chpc111:47076] *** End of error message ***
> [chpc111:47102] [ 3]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(paramagnetic_correction_aug_+0x1e30)
> [0x4718f0]
> [chpc111:47102] [ 4]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(suscept_crystal_+0x520f)
> [0x45c02f]
> [chpc111:47102] [ 5]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x(main+0x1a7)
> [0x43d227]
> [chpc111:47102] [ 6] /lib64/libc.so.6(__libc_start_main+0xed)
> [0x2ac89299db0d]
> [chpc111:47102] [ 7]
> /home/nervi/src/QE432/espresso-4.3.2/GIPAW/src/gipaw.x() [0x43d2fd]
> [chpc111:47102] *** End of error message ***
>
>
> But if I run with mpirun -n 1 I get the following:
>
> Fortran runtime warning: IEEE 'denormal number' exception not supported.
> At line 739 of file suscept_crystal.f90 (unit = 99, file =
> '/tmp/ceresoli/benzene.gipaw_recover')
> Fortran runtime error: I/O past end of record on unformatted file
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 47850 on
> node chpc111 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
>
>
> Any hints?
> Carlo
>
--
Lorenzo Paulatto IdR @ IMPMC/CNRS & Université Paris 6
phone: +33 (0)1 44275 084 / skype: paulatz
www: http://www-int.impmc.upmc.fr/~paulatto/
mail: 23-24/4é16 Boîte courrier 115, 4 place Jussieu 75252 Paris Cédex 05