Dear all,
I'm trying to handle signals inside an MPI task farming model. The following
is pseudo-code for what I'm trying to achieve:
volatile sig_atomic_t unexpected_error_occurred = 0;

void my_handler( int sig ) {
    unexpected_error_occurred = 1;
}

// ... somewhere in the code ...
signal(SIGTERM, my_handler);

if (root process) {
    // do stuff
    if ( unexpected_error_occurred ) {
        // save something
        // re-raise SIGTERM, but now with the default handler
        signal(SIGTERM, SIG_DFL);
        raise(SIGTERM);
    }
} else { // slave process
    // do different stuff
    if ( unexpected_error_occurred ) {
        // just propagate the signal to the root
        signal(SIGTERM, SIG_DFL);
        raise(SIGTERM);
    }
}

signal(SIGTERM, SIG_DFL); // reassign default handler
// continues the code...
As can be seen, the signal handling is needed to implement a restart
feature. The whole problem lies in my assumption that, when one process
re-raises SIGTERM, all other processes in the communicator will receive a
SIGTERM as a side effect. Is that a valid assumption? How does the actual
MPI implementation deal with such scenarios?
I also tried replacing all the raise() calls with MPI_Abort(), which,
according to the documentation (
http://www.open-mpi.org/doc/v1.5/man3/MPI_Abort.3.php), sends a SIGTERM to
all associated processes. The undesired behaviour persists: when I kill a
slave process, the save section in the root branch is not executed.
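
In case it helps to reproduce the problem, the signal-handling part of the
pseudo-code above can be isolated from MPI. The following is only a minimal
sketch under that assumption: it contains no MPI calls, and the helper names
install_sigterm_handler(), error_was_flagged() and reraise_with_default() are
hypothetical, introduced here purely for illustration.

```c
#include <signal.h>

/* Flag set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t unexpected_error_occurred = 0;

/* Handler: only records that SIGTERM arrived. */
static void my_handler(int sig)
{
    (void)sig; /* unused */
    unexpected_error_occurred = 1;
}

/* Hypothetical helper: install the handler.
   Returns 0 on success, -1 if signal() fails. */
int install_sigterm_handler(void)
{
    return signal(SIGTERM, my_handler) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: non-zero once the handler has run. */
int error_was_flagged(void)
{
    return unexpected_error_occurred != 0;
}

/* Hypothetical helper: restore the default action and re-raise SIGTERM.
   With the default action this terminates the calling process, so any
   saving has to happen before this call. */
void reraise_with_default(void)
{
    signal(SIGTERM, SIG_DFL);
    raise(SIGTERM);
}
```

Each process would call install_sigterm_handler() once, check
error_was_flagged() at convenient points in its work loop, and call
reraise_with_default() after saving. Note that this only affects the process
that calls it; whether the other ranks see a SIGTERM as well is exactly the
part I am unsure about.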
I appreciate any help,
Júlio.