Dear all,
I'm trying to handle signals inside an MPI task farming model. Below is pseudo-code for what I'm trying to achieve:
volatile sig_atomic_t unexpected_error_occurred = 0;
void my_handler( int sig )
{
    unexpected_error_occurred = 1;
}
//
// somewhere in the code...
//
signal(SIGTERM, my_handler);
if (root process) {
    // do stuff
    if ( unexpected_error_occurred ) {
        // save something
        // re-raise SIGTERM, but now with the default handler
        signal(SIGTERM, SIG_DFL);
        raise(SIGTERM);
    }
}
else { // slave process
    // do different stuff
    if ( unexpected_error_occurred ) {
        // just propagate the signal to the root
        signal(SIGTERM, SIG_DFL);
        raise(SIGTERM);
    }
}
signal(SIGTERM, SIG_DFL); // reassign default handler
// the code continues...
As can be seen, the signal handling is needed to implement a restart feature. The whole problem lies in my assumption that all processes in the communicator will receive a SIGTERM as a side effect. Is this a valid assumption? How does the actual MPI implementation deal with such scenarios?
I also tried replacing all the raise() calls with MPI_Abort(), which according to the documentation (http://www.open-mpi.org/doc/v1.5/man3/MPI_Abort.3.php) sends a SIGTERM to all associated processes. The undesired behaviour persists: when a slave process is killed, the save section in the root branch is not executed.
Appreciate any help,
Júlio.