Hello, all -
I have an OpenMPI application that generates a file while it runs. No
big deal. However, I'd like to delete the partial file if the job is
aborted via a user signal. In a non-MPI application, I'd use sigaction
to intercept the SIGTERM and delete the open files there. I'd then call
the "old" signal handler. When I tried this with my OpenMPI program,
the signal was caught, the files deleted, the processes exited, but the
MPI exec command as a whole did not exit. This is the technique, by
the way, that was described in this IBM MPI document:
http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.cluster.pe.doc/pe_linux42/am106l0037.html
My question is, what is the "right" way to do this under OpenMPI? The
only way I got the thing to work was by resetting the sigaction to the
old handler and re-raising the signal. It seems to work, but I want to
know if I am going to get "bit" by this. Specifically, am I "closing"
MPI correctly by doing this?
I am running OpenMPI 1.2.5 under Fedora 8 on Linux in an x86_64
environment. My compiler is gcc 4.1.2. This behavior happens when all
processes are running on the same node using shared memory and between
nodes when using TCP transport. I don't have access to any other
transport.
Thanks for your help.
Jesse Keller
454 Life Sciences
Here's a code snippet to demonstrate what I'm talking about.
------------------------------------------------------------------------
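#include <signal.h>
#include <mpi.h>

void UnlinkOpenedFiles();     /* Defined elsewhere: deletes partial files. */
void doSomeMPIComputation();  /* Defined elsewhere. */

struct sigaction sa_old_term; /* Global: disposition in place before ours. */

void
SIGTERM_handler(int signum, siginfo_t *siginfo, void *context)
{
    UnlinkOpenedFiles(); /* Global function to delete partial files. */

    /* The commented code doesn't work. */
    //if (sa_old_term.sa_sigaction)
    //{
    //    sa_old_term.sa_flags = SA_SIGINFO;
    //    (*sa_old_term.sa_sigaction)(signum, siginfo, context);
    //}

    /* Workaround: restore the old handler and re-raise the signal. */
    sigaction(SIGTERM, &sa_old_term, NULL);
    raise(signum);
}

int main(int argc, char **argv)
{
    MPI::Init(argc, argv);

    /* Install our handler after MPI::Init, saving the previous one. */
    struct sigaction sa_term;
    sigemptyset(&sa_term.sa_mask);
    sa_term.sa_flags = SA_SIGINFO;
    sa_term.sa_sigaction = SIGTERM_handler;
    sigaction(SIGTERM, &sa_term, &sa_old_term);

    doSomeMPIComputation();
    MPI::Finalize();
    return 0;
}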