Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Question regarding SELF-checkpointing
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2011-08-31 11:35:55

That seems like a bug to me.

What version of Open MPI are you using? How have you set up the C/R
functionality (which MCA options do you have set, and which command-line
options are you using)? Can you send a small reproducing application
that we can test against?

That should help us focus in on the problem a bit.

-- Josh

On Wed, Aug 31, 2011 at 6:36 AM, Faisal Shahzad <itsfaisi_at_[hidden]> wrote:
> Dear Group,
> I have an MPI program in which every process communicates with its
> neighbors. When SELF-checkpointing, every process writes to a separate file.
> The problem is that sometimes, after taking a checkpoint, the program does
> not continue. The problem gets more severe as the number of processes grows.
> With just 1 process (no communication), SELF-checkpointing works normally
> with no problem.
> I have tried different '--mca btl' parameters (openib,tcp,sm,self), but
> the problem persists.
> I would very much appreciate your support with this.
> Kind regards,
> Faisal

Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory