Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Barrier in Self-checkpointing call
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2012-02-15 10:56:41


When you receive that callback, the MPI library has been put in a quiescent
state. As such, it does not allow MPI communication until the checkpoint is
completely finished, so you cannot call a barrier in the checkpoint callback.
Since Open MPI is doing a coordinated checkpoint, you can assume that all
processes are calling the same callback at about the same time (the
coordination algorithm synchronizes them for you).
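
As a rough sketch of a checkpoint callback that stays within this
restriction, the code below does only process-local work. It assumes the
callback returns 0 on success and that the application cached its rank before
the checkpoint started. The names g_rank and g_iteration and the file naming
scheme are hypothetical, and handling of restart_cmd is left out because it
is not covered here.

```c
#include <stdio.h>

/* Hypothetical application state, cached after MPI_Init and updated by the
 * main loop, so the callback does not need to make any MPI calls. */
static int g_rank = 0;
static long g_iteration = 0;

/* SELF checkpoint callback. MPI is quiescent here, so this does only
 * process-local work: no MPI_Barrier, no other MPI communication.
 * Assumption: returning 0 signals success. */
int opal_crs_self_user_checkpoint(char **restart_cmd)
{
    char path[64];
    FILE *fp;

    (void)restart_cmd; /* restart command handling is omitted in this sketch */

    /* One file per rank; the naming scheme is illustrative only. */
    snprintf(path, sizeof(path), "app_state.%d.ckpt", g_rank);
    fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }
    fprintf(fp, "%ld\n", g_iteration);
    return fclose(fp) == 0 ? 0 : -1;
}
```

Each process writes its own file, so no cross-process synchronization is
needed inside the callback.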

If you would like a notification callback before the quiescence protocol
you might want to look at the INC callbacks:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_inc_register_callback
They are available in the Open MPI trunk (v1.7). The OMPI_CR_INC_PRE_CRS_PRE_MPI
callback will give you immediate notice, and you -should- be able to make
MPI calls in that callback. I have not tried it, but conceptually it should
work. If it does not, I can file a bug ticket and we can look into
addressing it.
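
To illustrate why the timing of the callback matters, here is a toy model of
the ordering described above. It is not Open MPI code: every name in it
(model_barrier, model_checkpoint, try_barrier, model_quiescent) is
hypothetical, and the real registration call is the one documented at the
link above. The model also returns an error code from the barrier while
quiescent, purely to keep the example short.

```c
#include <stdbool.h>

/* Toy model only: none of these names are Open MPI symbols. */
typedef int (*hook_fn)(void);

static bool model_quiescent = false;   /* true while "MPI" is paused */

/* Stand-in for a collective call: succeeds (0) unless quiescent (-1). */
int model_barrier(void)
{
    return model_quiescent ? -1 : 0;
}

/* Runs one checkpoint: pre-quiescence hook, quiesce, checkpoint hook, resume.
 * Stores each hook's return value so the caller can inspect them. */
void model_checkpoint(hook_fn pre_hook, hook_fn ckpt_hook,
                      int *pre_result, int *ckpt_result)
{
    *pre_result = pre_hook();      /* communication still allowed */
    model_quiescent = true;
    *ckpt_result = ckpt_hook();    /* communication refused */
    model_quiescent = false;
}

/* Both hooks attempt the same thing: synchronize. */
int try_barrier(void)
{
    return model_barrier();
}
```

Calling model_checkpoint(try_barrier, try_barrier, &pre, &ckpt) leaves pre
at 0 and ckpt at -1: the same synchronization attempt works in the hook that
runs before quiescence and is refused in the one that runs during it.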

-- Josh

On Wed, Feb 15, 2012 at 4:23 AM, Faisal Shahzad <itsfaisi_at_[hidden]> wrote:

> Dear Group,
>
> I wanted to do a synchronization check with 'MPI_Barrier(MPI_COMM_WORLD)'
> in 'opal_crs_self_user_checkpoint(char **restart_cmd)' call. Although every
> process is present in this call, it fails to synchronize. Is there any
> reason why we can't use a barrier?
> Thanks in advance.
>
> Kind regards,
> Faisal

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey