Hello, all -

 

I have an OpenMPI application that generates a file while it runs. No big deal. However, I’d like to delete the partial file if the job is aborted via a user signal. In a non-MPI application, I’d use sigaction to intercept SIGTERM, delete the open files in the handler, and then call the “old” signal handler. When I tried this with my OpenMPI program, the signal was caught, the files were deleted, and the processes exited, but the mpiexec command as a whole did not exit. This is the technique, by the way, that is described in this IBM MPI document:

 

http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.cluster.pe.doc/pe_linux42/am106l0037.html

 

My question is, what is the “right” way to do this under OpenMPI? The only way I got it to work was by restoring the old handler with sigaction and then re-raising the signal. It seems to work, but I want to know if I am going to get bitten by this. Specifically, am I “closing” MPI correctly by doing this?

 

I am running OpenMPI 1.2.5 under Fedora 8 Linux in an x86_64 environment. My compiler is gcc 4.1.2. This behavior happens both when all processes run on the same node using shared memory and when they run across nodes using the TCP transport. I don’t have access to any other transport.

 

Thanks for your help.

 

Jesse Keller

454 Life Sciences

 

Here’s a code snippet to demonstrate what I’m talking about.

 

----------------------------------------------------------------------------------------------------

 

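#include <mpi.h>
#include <signal.h>

void UnlinkOpenedFiles();      /* Defined elsewhere: deletes partial files. */
void doSomeMPIComputation();   /* Defined elsewhere. */

struct sigaction sa_old_term;  /* Global: the SIGTERM disposition we replaced. */

void
SIGTERM_handler(int signum, siginfo_t * siginfo, void * a)
{
    UnlinkOpenedFiles(); /* Global function to delete partial files. */

    /* The commented code doesn’t work. */
    //if (sa_old_term.sa_sigaction)
    //{
    //      sa_old_term.sa_flags = SA_SIGINFO;
    //      (*sa_old_term.sa_sigaction)(signum, siginfo, a);
    //}

    /* Restore the previous disposition, then re-raise the signal. */
    sigaction(SIGTERM, &sa_old_term, NULL);
    raise(signum);
}

int main(int argc, char ** argv)
{
    MPI::Init(argc, argv);

    /* Install our handler and save the previous disposition in sa_old_term. */
    struct sigaction sa_term;
    sigemptyset(&sa_term.sa_mask);
    sa_term.sa_flags = SA_SIGINFO;
    sa_term.sa_sigaction = SIGTERM_handler;
    sigaction(SIGTERM, &sa_term, &sa_old_term);

    doSomeMPIComputation();

    MPI::Finalize();
    return 0;
}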
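
For comparison, here is a minimal non-MPI sketch of the same reset-and-re-raise pattern, so that the signal handling can be checked in isolation. It is only an illustration under stated assumptions: the names `on_sigterm`, `g_partial_path`, and `run_demo` are hypothetical and not part of the application above, and the child starts from the default SIGTERM disposition. It says nothing about what OpenMPI itself does with the signal. It shows only that after the handler restores the old disposition and re-raises, the process is reported as killed by SIGTERM rather than as having exited normally, and that the partial file is gone.

```cpp
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static struct sigaction g_old_term;   // SIGTERM disposition we replaced
static char g_partial_path[256];      // hypothetical partial output file

// Handler: delete the partial file, restore the old disposition, re-raise.
// unlink(), sigaction() and raise() are all async-signal-safe per POSIX.
extern "C" void on_sigterm(int signo)
{
    unlink(g_partial_path);
    sigaction(signo, &g_old_term, NULL);
    // SIGTERM is blocked while this handler runs, so the re-raised signal
    // is delivered (with the old disposition) once the handler returns.
    raise(signo);
}

// Forks a child that writes `path`, installs the handler, and sends itself
// SIGTERM. Returns the signal that killed the child, or -1 otherwise.
int run_demo(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        // Assumption: start from the default disposition so the demo
        // does not depend on what the parent process had installed.
        signal(SIGTERM, SIG_DFL);

        snprintf(g_partial_path, sizeof g_partial_path, "%s", path);
        FILE *f = fopen(path, "w");
        if (!f)
            _exit(2);
        fputs("partial output\n", f);
        fclose(f);

        struct sigaction sa = {};
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sa.sa_handler = on_sigterm;
        sigaction(SIGTERM, &sa, &g_old_term);

        raise(SIGTERM);
        _exit(1);   // only reached if the re-raise did not terminate us
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling `run_demo("/tmp/partial_demo.out")` from a small `main` should return `SIGTERM`, and the file should no longer exist afterwards. The parent sees a signal death because the default action, not the custom handler, is what finally terminates the child.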