It should work - check the following srun option:

       -K, --kill-on-bad-exit[=0|1]
              Controls whether or not to terminate a job if any task exits with a
              non-zero exit code. If this option is not specified, the default
              action will be based upon the SLURM configuration parameter of
              KillOnBadExit. If this option is specified, it will take precedence
              over KillOnBadExit. An option argument of zero will not terminate
              the job. A non-zero argument or no argument will terminate the job.
              Note: This option takes precedence over the -W, --wait option to
              terminate the job immediately if a task exits with a non-zero exit
              code.

My guess is that KillOnBadExit has not been set in your Slurm configuration, or that you aborted with a zero exit status.
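
Something along these lines should show which case applies. This is only a sketch: ./my_mpi_app and the task count of 4 are placeholders rather than anything from your setup, and it assumes scontrol is available on the node you submit from.

```shell
# Show the cluster-wide default from the running Slurm configuration.
scontrol show config | grep -i KillOnBadExit

# Override the default for this one launch, so that any task exiting
# with a non-zero code terminates the whole step.
# ./my_mpi_app and -n 4 are placeholders.
srun --kill-on-bad-exit=1 -n 4 ./my_mpi_app
```

If the first command reports 0 for KillOnBadExit, the site default leaves the remaining ranks running, and either the srun option above or KillOnBadExit=1 in slurm.conf should change that. It may also be worth checking that the error code passed to MPI_Abort() is non-zero.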


On Feb 26, 2013, at 9:08 AM, Bokassa <bokassa@gmail.com> wrote:

Hi Ralph, thanks for your answer. I am using:

>mpirun --version
mpirun (Open MPI) 1.5.4


and slurm 2.5.

Should I try to upgrade to 1.6.5?



/David/Bigagli
www.davidbigagli.com


On Mon, Feb 25, 2013 at 7:38 PM, Bokassa <bokassa@gmail.com> wrote:
Hi,
   I noticed that MPI_Abort() does not abort the tasks if the MPI program is started using srun.
I call MPI_Abort() from rank 0 and that process exits, but the other ranks keep running or waiting for I/O
on the other nodes. The only way to kill the job is to use scancel.
However, if I use mpirun under a Slurm allocation, then MPI_Abort() works as expected and aborts
all tasks.

Is this a known issue?

Thanks, David


_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users