Hello!

I am working on some simulations in which I have to periodically kill, restart, and checkpoint an MPI application.

Since a checkpoint can take place immediately after a restart, I need some way to know whether the ompi-restart of the application is complete.
If I do not ensure that all application processes have finished restarting, ompi-checkpoint fails after throwing a slew of errors.

Can someone please suggest a way to get some kind of notification indicating that the restart has completed (in the sense that checkpointing can safely proceed)?

Thank you,
Kishor