Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Use unique collective ids for the checkpoint/restart code
From: Adrian Reber (adrian_at_[hidden])
Date: 2014-02-04 09:16:34


Thanks for spotting the 'printf'. I removed it, as it was only there for
debugging at a very early stage. I committed the patch without the 'printf' to svn.

                Adrian

On Mon, Feb 03, 2014 at 12:42:39PM -0800, Ralph Castain wrote:
> Looks okay to me - I see you left a "printf" statement in plm_base_launch_support.c, so you might want to make that an opal_output_verbose or something.
>
> On Feb 3, 2014, at 12:19 PM, Adrian Reber <adrian_at_[hidden]> wrote:
>
> > This patch
> >
> > https://lisas.de/git/?p=open-mpi.git;a=commitdiff;h=14ec7f42baab882e345948ff79c4f75f5084bbbf
> >
> > introduces unique collective ids for the checkpoint/restart code, and
> > with this applied it seems to work pretty well. As this patch also
> > touches non-CR code, it would be good if someone could have a look at it.
> >
> > With this patch applied, the code seems to work up to the point where
> > orterun actually pauses all processes and tries to create the
> > checkpoints. The checkpoint creation does not work for me, as CRS does
> > not yet include support for checkpoint/restart using CRIU; adding that
> > would be my next step.
> >
> > Adrian
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel