Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Barrier in Self-checkpointing call
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-02-15 10:56:41

When you receive that callback, MPI has been put in a quiescent state. As
such, it does not allow MPI communication until the checkpoint is completely
finished, so you cannot call a barrier in the checkpoint callback. Since Open
MPI is doing a coordinated checkpoint, you can assume that all processes
are calling the same callback at about the same time (the coordination
algorithm synchronizes them for you).
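
To illustrate the point above, here is a minimal sketch of a checkpoint callback that does only process-local work and makes no MPI calls. The callback name and its `char **restart_cmd` argument come from the question below; the `int` return type, the file name, the restart command string, the `current_iteration` variable, and the `save_local_state` helper are all assumptions made for illustration, as is the choice to hand back a heap-allocated string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application state; a real program would track more. */
static int current_iteration = 0;

/* Hypothetical file name for this process's saved state. */
static const char *state_path = "app_state.ckpt";

/* Hypothetical helper: write local state to a file. No MPI calls. */
static int save_local_state(const char *path, int iteration)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }
    int ok = fprintf(fp, "%d\n", iteration) > 0;
    if (fclose(fp) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Checkpoint callback (return type assumed to be int).
 * MPI is quiescent here, so no MPI_Barrier or other communication:
 * the coordinated checkpoint has already synchronized the processes. */
int opal_crs_self_user_checkpoint(char **restart_cmd)
{
    assert(restart_cmd != NULL);

    if (save_local_state(state_path, current_iteration) != 0) {
        return -1;
    }

    /* Hypothetical restart command, returned as a heap-allocated copy. */
    const char *cmd = "./my_app --restart";
    *restart_cmd = malloc(strlen(cmd) + 1);
    if (*restart_cmd == NULL) {
        return -1;
    }
    strcpy(*restart_cmd, cmd);
    return 0;
}
```

Any synchronization the application needs beyond what the coordination algorithm provides would have to happen outside this callback.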

If you would like a notification callback before the quiescence protocol,
you might want to look at the INC callbacks.
They are available in the Open MPI trunk (v1.7). The
callback will give you immediate notice, and you -should- be able to make
MPI calls in that callback. I have not tried it, but conceptually it should
work. If it does not, I can file a bug ticket and we can look into
addressing it.

-- Josh

On Wed, Feb 15, 2012 at 4:23 AM, Faisal Shahzad <itsfaisi_at_[hidden]> wrote:

> Dear Group,
> I wanted to do a synchronization check with 'MPI_Barrier(MPI_COMM_WORLD)'
> in 'opal_crs_self_user_checkpoint(char **restart_cmd)' call. Although every
> process is present in this call, it fails to synchronize. Is there any
> reason why we can't use a barrier?
> Thanks in advance.
> Kind regards,
> Faisal

Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory