On Feb 25, 2013, at 10:38 AM, Bokassa <bokassa_at_[hidden]> wrote:
> I noticed that MPI_Abort() does not abort the tasks if the MPI program is started using srun.
> I call MPI_Abort() from rank 0; this process exits, but the other ranks keep running or waiting for I/O
> on the other nodes. The only way to kill the job is to use scancel.
> However, if I use mpirun under a Slurm allocation, MPI_Abort() works as expected, aborting
> all tasks.
> Is this a known issue?
What version of OMPI are you using? Slurm should detect the process failure and kill the job, unless it was configured not to do so.
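
For reference, the Slurm-side behavior mentioned above is controlled by settings along the following lines. This is only a sketch, not checked against your site's configuration: `KillOnBadExit` is the slurm.conf parameter and `--kill-on-bad-exit` is the corresponding srun option, while `./my_mpi_app` is a placeholder name for your program. It also assumes the rank that calls MPI_Abort() exits with a non-zero status, which depends on the error code passed in.

```
# slurm.conf: cluster-wide default
# (1 = terminate the whole step as soon as any task exits with a non-zero code)
KillOnBadExit=1

# Or per job step, on the srun command line
# (./my_mpi_app is a hypothetical executable name):
srun --kill-on-bad-exit=1 ./my_mpi_app
```

If the cluster default is 0, the remaining ranks may be left running after rank 0 exits, which would match what you describe.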
> Thanks, David