When you receive that callback, the MPI library has been put in a quiescent state. As such, it does not allow MPI communication until the checkpoint is completely finished, so you cannot call MPI_Barrier in the checkpoint callback. Since Open MPI is doing a coordinated checkpoint, you can assume that all processes are calling the same callback at about the same time (the coordination algorithm synchronizes them for you).
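
To make that concrete, here is a rough sketch of a checkpoint callback that only saves local state and makes no MPI calls at all. It is an illustration, not tested against a C/R-enabled build. The names current_step and my_rank, the file name, and the restart command string are hypothetical. The "return 0 on success" convention and the assumption that the caller takes ownership of the malloc'd restart_cmd string are also my assumptions here, so please check them against the SELF CRS documentation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application state; a real program would save its own data. */
static int current_step = 42;

/* Hypothetical: rank cached by the application right after MPI_Init,
 * so the callback never has to call into MPI. */
static int my_rank = 0;

int opal_crs_self_user_checkpoint(char **restart_cmd)
{
    const char *cmd = "./my_app --restart"; /* hypothetical restart command */
    char path[64];
    FILE *fp;

    /* No MPI_Barrier (or any other MPI communication) here: the library
     * is quiesced until the checkpoint has completely finished. */

    /* Save only this process's local state. */
    snprintf(path, sizeof(path), "ckpt_rank_%d.dat", my_rank);
    fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }
    if (fprintf(fp, "%d\n", current_step) < 0) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0) {
        return -1;
    }

    /* Assumption: the caller takes ownership of this malloc'd string. */
    if (restart_cmd != NULL) {
        *restart_cmd = malloc(strlen(cmd) + 1);
        if (*restart_cmd == NULL) {
            return -1;
        }
        strcpy(*restart_cmd, cmd);
    }

    return 0; /* assumed success value */
}
```

The point of the sketch is only that the callback needs no synchronization of its own: each process writes its own file, and the coordination protocol has already lined the processes up.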

If you would like a notification callback before the quiescence protocol starts, you might want to look at the INC callbacks:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_inc_register_callback
They are available in the Open MPI trunk (v1.7). The OMPI_CR_INC_PRE_CRS_PRE_MPI callback will give you immediate notice, and you -should- be able to make MPI calls in that callback. I have not tried it, but conceptually it should work. If it does not, I can file a bug ticket and we can look into addressing it.

-- Josh

On Wed, Feb 15, 2012 at 4:23 AM, Faisal Shahzad <itsfaisi@hotmail.com> wrote:
Dear Group,

I wanted to do a synchronization check with 'MPI_Barrier(MPI_COMM_WORLD)' in the 'opal_crs_self_user_checkpoint(char **restart_cmd)' call. Although every process is present in this call, it fails to synchronize. Is there any reason why we can't use a barrier?
Thanks in advance.

Kind regards,
Faisal




--
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey