[Pw_forum] Inconsistent behaviour of check_stop_init

Riccardo Di Meo dimeo at democritos.it
Fri Feb 5 13:21:48 CET 2010


Lorenzo Paulatto wrote:
> On Fri, 05 Feb 2010 12:33:33 +0100, Riccardo Di Meo <dimeo at democritos.it>  
> wrote:
>   
>> it is particularly annoying when the code is to be
>> inserted in an automated script (e.g. for the execution within a portal).
>>     
> Hi Riccardo,
> I think your solution (a) is more consistent, however it is not up to me  
> to decide. If you are interested, a few months ago, I used the  
> long-abandoned-but-still-working posix90 library to implement signal  
> handling in a private version of QE. This allowed for a "clean" stop of  
> the code (i.e. at the next save point) by simply sending a signal to the  
> program; something like pkill -USR1 pw.x or even pressing CTRL-C if  
> running interactively.
>   

Signal trapping would be a really nice addition imo, and I'm wondering 
if there's a "de facto" standard library to make Fortran codes signal 
aware (gfortran implements such a feature, but what about other 
compilers?). Sadly, since your patch relies on an abandoned library, I 
cannot use it in my project...

> My primary aim was to avoid the use of max_seconds (which I always forget  
> to set properly) configuring instead the queue system to send a SIGTERM a  
> few minutes before the time is up. Unluckily signal handling does not go  
> along very well with mpi, so I've never committed the changes, but I  
> should still have a working copy somewhere, if anybody is interested.

This is a bit strange... I mean, it is understandable that MPI doesn't 
appreciate signals being thrown at it, but I wonder whether this 
problem can be worked around... If I find the aforementioned signal 
trapping library, I may take a look at it.

As far as I understand, it would just require an update of the 
check_stop module.
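
To make the idea concrete, here is a minimal sketch of the kind of helper 
I have in mind, assuming it is written in C and called from Fortran 
through ISO_C_BINDING. The names install_stop_handlers and stop_requested 
are hypothetical and are not part of QE or of posix90; the handler only 
sets a flag, and the flag would be polled at the next save point.

```c
#include <signal.h>

/* Set asynchronously by the handler. volatile sig_atomic_t is the type
   that ISO C guarantees can be written safely from a signal handler. */
static volatile sig_atomic_t stop_flag = 0;

/* The handler does nothing but record that a stop was requested. */
static void on_stop_signal(int signum)
{
    (void)signum;
    stop_flag = 1;
}

/* Hypothetical entry point: trap SIGUSR1 (pkill -USR1 pw.x) and
   SIGTERM (sent by the queue system). Returns 0 on success, -1 on
   failure. signal() is used because it is plain ISO C; on POSIX
   systems sigaction() would be the more robust choice. */
int install_stop_handlers(void)
{
    if (signal(SIGUSR1, on_stop_signal) == SIG_ERR) return -1;
    if (signal(SIGTERM, on_stop_signal) == SIG_ERR) return -1;
    return 0;
}

/* Hypothetical query to be polled at each save point:
   returns 1 once a signal has arrived, 0 otherwise. */
int stop_requested(void)
{
    return stop_flag != 0;
}
```

On the Fortran side both functions could be declared with bind(C) 
interfaces, and check_stop could test stop_requested() next to its 
existing checks. In an MPI run only the process that receives the signal 
sees the flag, so I assume the result would have to be shared among all 
processes before acting on it, so that everybody stops at the same point.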


Cheers,
Riccardo



