Is there any way to automatically checkpoint and restart an application in OpenMPI? That is, can the application be checkpointed without running the ompi-checkpoint command, perhaps via a function call in the application's own code? The same question applies to restarting after a failure.
On a related note, what is the default behavior of an OpenMPI application after one process fails? Does the runtime shut down the whole application?