Open MPI User's Mailing List Archives


From: Daniel Spångberg (daniels_at_[hidden])
Date: 2007-08-16 12:04:07

Dear Open-MPI user list members,

I currently have a user with an application in which one of the MPI
processes dies, but the Open MPI system does not kill the rest of the
application.

Since the mpirun man page states the following, I would expect it to take
care of killing the application if a process exits without calling
MPI_Finalize:

    Process Termination / Signal Handling
        During the run of an MPI application, if any rank dies abnormally
        (either exiting before invoking MPI_FINALIZE, or dying as the
        result of a signal), mpirun will print out an error message and
        kill the rest of the MPI application.

The following test program demonstrates the behaviour (program hangs until
it is killed by the user or batch system):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <mpi.h>

#define RANK_DEATH 1

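int main(int argc, char **argv)
{
   int rank;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   sleep(10);
   /* One rank exits here without ever calling MPI_Finalize. */
   if (rank==RANK_DEATH)
      exit(1);
   sleep(10);

   MPI_Finalize();
   return 0;
}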

I have tested this on Open MPI 1.2.1 as well as the latest stable release,
1.2.3. I am on Linux x86_64.

Is this a bug, or are there some flags I can use to force the mpirun (or
orted, or...) to kill the whole MPI program when this happens?

If one of the application processes dies from a signal (I have tested SEGV
and FPE) rather than just exiting, the whole application is indeed killed.
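
To make the distinction between the two cases concrete, here is a minimal
sketch in plain POSIX C (no MPI involved) of how a parent process can tell
a child that simply exited from one that was killed by a signal. The helper
names run_child and killed_by_signal are made up for illustration, and this
is not a claim about how mpirun or orted actually monitor their processes:

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that either exits on its own with
   status 1 (die_by_signal == 0) or kills itself with SIGSEGV.
   Stores the raw wait status in *status; returns 0 on success, -1 on error. */
static int run_child(int die_by_signal, int *status)
{
   pid_t pid = fork();

   if (pid < 0)
      return -1;
   if (pid == 0) {
      if (die_by_signal) {
         signal(SIGSEGV, SIG_DFL);   /* make sure the default action applies */
         raise(SIGSEGV);             /* "dying as the result of a signal" */
      }
      _exit(1);                      /* plain exit, no signal involved */
   }
   if (waitpid(pid, status, 0) < 0)
      return -1;
   return 0;
}

/* Hypothetical helper: 1 if the child was killed by a signal,
   0 if it exited on its own, -1 for anything else. */
static int killed_by_signal(int status)
{
   if (WIFSIGNALED(status))
      return 1;
   if (WIFEXITED(status))
      return 0;
   return -1;
}
```

From the parent's point of view both kinds of death are visible through
waitpid; only the status macros (WIFEXITED or WIFSIGNALED) differ between
them.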

Best regards
Daniel Spångberg