Jeff, you were right. I did a series of Spawns and consecutive Merges
and forgot to set the exception handler on the newly created
intra-communicators. Since these properties obviously are not inherited
(which would be kind of hard considering that there are multiple
communicators to be merged), the default non-exception-throwing handler
Jeff Squyres wrote:
> On Nov 7, 2007, at 7:43 PM, Murat Knecht wrote:
>> when MPI_Spawn cannot launch an application for whatever reason, the
>> entire job is cancelled with some message like the following.
> That is correct; MPI states that the default error handler is
>> Is there a way to handle this nicely, e.g. by throwing an exception? I
> Sure; change the default error handler on the communicator that
> you are using in the call to COMM_SPAWN.
> I don't know if we have checked this particular code path to ensure
> that OMPI will be stable after this, but it might work...
>> understand, this does not work, when the job is first started with
>> mpirun, as there is no application yet to fall back on, but in case of a
>> running application, it should be possible to simply inform it that the
>> spawning request failed. Then the application could begin to handle the
>> error and terminate gracefully. I did enable C++ Exceptions btw, so I
>> guess this is not implemented. Is there a technical (e.g.
>> reason behind this, or simply a yet-to-be-added feature?
> The MPI layer is written in C; it will not throw exceptions unless you
> use the MPI C++ bindings to enable the MPI::ERRORS_THROW_EXCEPTIONS
> error handler. Also be sure to configure/build Open MPI with the
> --enable-cxx-exceptions flag, which sets the compiler flags needed for
> the C code to propagate C++ exceptions (it's not enabled by default
> because it imposes a slight performance penalty).