There is such an option in the 1.7 series and on the trunk, but I don't see it in v1.6.
-mca orte_abort_non_zero_exit 0
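Combined with the command line from the question, the invocation would look roughly like the sketch below. It assumes a 1.7-series or trunk build; prog1 and prog2 are the placeholder program names from the question. As its name suggests, the parameter concerns processes that exit with a non-zero status. Behaviour for a process killed by a signal is not confirmed here.

```shell
# Sketch only: assumes Open MPI 1.7 series or trunk (the parameter is not in v1.6).
# prog1 and prog2 are the placeholder programs from the original question.
# Setting orte_abort_non_zero_exit to 0 asks mpiexec not to abort the whole job
# when one process returns a non-zero exit status.
mpiexec -mca orte_abort_non_zero_exit 0 -n 1 prog1 : -n 1 prog2
```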
On May 30, 2013, at 3:40 AM, Victor Vysotskiy <Victor.Vysotskiy_at_[hidden]> wrote:
> Dear OpenMPI Developers and Users,
> I have a general question on signal trapping/handling within mpiexec/mpirun. Assume that I have 2 cores and I start two different (independent) programs, prog1 and prog2, in parallel via the mpirun/mpiexec startup command:
> mpiexec -n 1 prog1 : -n 1 prog2
> What happens if one of the programs crashes or terminates abnormally while the second one is still running normally? Is it correct that in such a case OpenMPI immediately starts a cleanup and automatically terminates all spawned/running processes? If so, is there any way to force mpiexec/mpirun not to clean up all processes on error, and instead to wait until all spawned processes either complete successfully or terminate abnormally?
> Thank you in advance!