Hello, all -


I have an OpenMPI application that generates a file while it runs. No big deal. However, I’d like to delete the partial file if the job is aborted via a user signal. In a non-MPI application, I’d use sigaction to intercept SIGTERM, delete the open files in the handler, and then call the “old” signal handler. When I tried this with my OpenMPI program, the signal was caught, the files were deleted, and the processes exited, but the MPI exec command as a whole did not exit. This is the technique, by the way, that was described in this IBM MPI document:




My question is, what is the “right” way to do this under OpenMPI?  The only way I got the thing to work was by resetting the sigaction to the old handler and re-raising the signal.  It seems to work, but I want to know if I am going to get “bit” by this.  Specifically, am I “closing” MPI correctly by doing this?


I am running OpenMPI 1.2.5 under Fedora 8 on Linux in an x86_64 environment. My compiler is gcc 4.1.2. This behavior happens both when all processes run on the same node using shared memory and when they run across nodes using the TCP transport. I don’t have access to any other transport.


Thanks for your help.


Jesse Keller

454 Life Sciences


Here’s a code snippet to demonstrate what I’m talking about.




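#include <signal.h>
#include <mpi.h>

void UnlinkOpenedFiles();      /* Global function to delete partial files (defined elsewhere). */

struct sigaction sa_old_term;  /* Global. */

void SIGTERM_handler(int signal, siginfo_t *siginfo, void *a)
{
    UnlinkOpenedFiles();

    /* The commented code doesn’t work. */
    //if (sa_old_term.sa_sigaction)
    //{
    //      sa_old_term.sa_flags = SA_SIGINFO;
    //      (*sa_old_term.sa_sigaction)(signal, siginfo, a);
    //}

    /* What does work: restore the old handler and re-raise the signal. */
    sigaction(SIGTERM, &sa_old_term, NULL);
    raise(signal);
}

int main(int argc, char **argv)
{
    MPI::Init(argc, argv);

    struct sigaction sa_term;
    sigemptyset(&sa_term.sa_mask);  /* Don't leave the mask uninitialized. */
    sa_term.sa_flags = SA_SIGINFO;
    sa_term.sa_sigaction = SIGTERM_handler;
    sigaction(SIGTERM, &sa_term, &sa_old_term);

    /* ... rest of the application ... */

    MPI::Finalize();
    return 0;
}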
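
For comparison, the restore-and-re-raise workaround can be reduced to a minimal non-MPI sketch. This is only an illustration under stated assumptions: the file name partial_output.tmp and every function name below are hypothetical, unlink_partial_files() merely stands in for UnlinkOpenedFiles(), and nothing here shows whether MPI itself is shut down cleanly. The sketch forks a child so that the parent can observe how the child terminated.

```cpp
#include <csignal>
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical path of the partial output file (not from the real program).
static const char *kPartialFile = "partial_output.tmp";

// Disposition that was in effect before our handler was installed.
static struct sigaction g_old_term;

// Stand-in for UnlinkOpenedFiles(); unlink() is async-signal-safe.
static void unlink_partial_files() {
    unlink(kPartialFile);
}

static void on_sigterm(int signo) {
    unlink_partial_files();
    // Restore the previous disposition, then re-raise. SIGTERM is blocked
    // while this handler runs, so the re-raised signal is delivered right
    // after the handler returns, this time with the old disposition.
    sigaction(signo, &g_old_term, nullptr);
    raise(signo);
}

static void install_cleanup_handler() {
    struct sigaction sa;
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGTERM, &sa, &g_old_term);
}

// Forks a child that writes the file, installs the handler and sends itself
// SIGTERM. Returns true if the child was terminated by SIGTERM (rather than
// exiting normally) and the partial file no longer exists.
bool child_dies_by_sigterm_and_file_is_gone() {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        // Start from the default disposition so the demo is deterministic.
        std::signal(SIGTERM, SIG_DFL);
        FILE *f = std::fopen(kPartialFile, "w");
        if (f) { std::fputs("partial\n", f); std::fclose(f); }
        install_cleanup_handler();
        raise(SIGTERM);
        _exit(0);  // Only reached if the signal did not terminate the child.
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    bool killed = WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
    bool gone = access(kPartialFile, F_OK) != 0;
    return killed && gone;
}
```

Calling child_dies_by_sigterm_and_file_is_gone() from a main() checks both halves of the pattern: the cleanup ran, and the parent still sees a process that died from SIGTERM. A launcher watching its children may depend on that exit status, which could be one reason the re-raise variant behaves differently from calling the old handler directly.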