Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] C/R and orte_oob
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2014-02-07 08:14:05


In the original implementation, the OOB ft_event did very little during
checkpoint preparation and continue. We did not even close the sockets.
During restart, however, the OOB needs to renegotiate the socket
connections, usually by calling the component's finalization function
(close stale sockets) and then its initialization function (create new
sockets).
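
Roughly, that restart behavior could be sketched as below. This is only an illustration, not the actual ORTE code: the FT_* states, oob_init(), oob_finalize(), and the sockets_open/generation counters are hypothetical stand-ins for the real checkpoint states and component functions.

```c
/* Hypothetical stand-ins for the checkpoint states passed to ft_event(). */
enum ft_state { FT_CHECKPOINT, FT_CONTINUE, FT_RESTART, FT_TERM };

/* Pretend connection state for one OOB component (illustration only). */
static int sockets_open = 0;  /* 1 while the component holds live sockets */
static int generation = 0;    /* counts how often sockets were (re)created */

/* Hypothetical component init: create new sockets. */
static int oob_init(void)
{
    sockets_open = 1;
    generation++;
    return 0;
}

/* Hypothetical component finalize: close (possibly stale) sockets. */
static int oob_finalize(void)
{
    sockets_open = 0;
    return 0;
}

/* Sketch of an ft_event handler: nothing on checkpoint/continue,
 * finalize followed by init on restart. */
static int oob_ft_event(int state)
{
    int ret;

    switch (state) {
    case FT_CHECKPOINT:
    case FT_CONTINUE:
    case FT_TERM:
        /* Original behavior as described: sockets are left untouched. */
        return 0;
    case FT_RESTART:
        /* Sockets restored from the checkpoint image are stale. */
        if (0 != (ret = oob_finalize())) {
            return ret;
        }
        return oob_init();
    default:
        return -1;  /* unknown state */
    }
}
```

Calling oob_ft_event(FT_RESTART) after an initial oob_init() leaves the sockets open but bumps generation, which is the "close stale, create new" step; the other states return without changing anything.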

I'm not sure whether that is still an acceptable approach. We will want
the OOB to be quieted across checkpoint preparation and continue so
that we don't lose any messages or have messages cross the checkpoint line.
So this is probably something to return to in the next pass.
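
For the quieting itself, the event-driven sequence Ralph outlines in the quoted message below (base -> components -> modules, then "quiet" reports flowing back up) might look something like this. It is a sketch under heavy assumptions: every name here is made up for illustration, and a single FIFO stands in for what would really be the orte_event_base plus one event base/thread per module.

```c
#include <stddef.h>

#define MAX_EVENTS 64

typedef void (*event_cb)(void *arg);
struct event { event_cb cb; void *arg; };

/* Tiny FIFO standing in for an event base (assumption: no wrap-around). */
static struct event queue[MAX_EVENTS];
static size_t q_head, q_tail;

static int push_event(event_cb cb, void *arg)
{
    if (q_tail == MAX_EVENTS) return -1;
    queue[q_tail].cb = cb;
    queue[q_tail].arg = arg;
    q_tail++;
    return 0;
}

/* Drain the queue, including events pushed by the callbacks themselves. */
static void run_events(void)
{
    while (q_head < q_tail) {
        struct event e = queue[q_head++];
        e.cb(e.arg);
    }
}

struct component;
struct oob_base  { struct component *components; size_t n_components, n_quiet; int all_quiet; };
struct module    { struct component *owner; int engine_running; };
struct component { struct module *modules; size_t n_modules, n_quiet; struct oob_base *base; };

/* Return values of push_event() are ignored below to keep the sketch short. */

/* Base: one component reported quiet; when all have, move to the next step. */
static void base_component_done(void *arg)
{
    struct oob_base *base = arg;
    if (++base->n_quiet == base->n_components) base->all_quiet = 1;
}

/* Component: one module reported quiet; when all have, notify the base. */
static void component_module_done(void *arg)
{
    struct component *c = arg;
    if (++c->n_quiet == c->n_modules) push_event(base_component_done, c->base);
}

/* Module: shut down the message engine, then notify the owning component. */
static void module_quiet(void *arg)
{
    struct module *m = arg;
    m->engine_running = 0;
    push_event(component_module_done, m->owner);
}

/* Component ft_event: create one event per active module. */
static void component_quiet_request(void *arg)
{
    struct component *c = arg;
    if (c->n_modules == 0) { push_event(base_component_done, c->base); return; }
    for (size_t i = 0; i < c->n_modules; i++)
        push_event(module_quiet, &c->modules[i]);
}

/* Base: loop across the components, asking each one to quiet itself. */
static void base_quiet_request(void *arg)
{
    struct oob_base *base = arg;
    if (base->n_components == 0) { base->all_quiet = 1; return; }
    for (size_t i = 0; i < base->n_components; i++)
        push_event(component_quiet_request, &base->components[i]);
}
```

The caller would push base_quiet_request for the base and keep processing events; all_quiet only flips once every module of every component has reported back, so nothing blocks in a plain function call waiting for the modules.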

On Thu, Feb 6, 2014 at 4:45 PM, Ralph Castain <rhc_at_[hidden]> wrote:

>
> On Feb 6, 2014, at 2:16 PM, Adrian Reber <adrian_at_[hidden]> wrote:
>
> > Josh explained to me a few days ago that after a checkpoint has been
> > received, TCP should no longer be used, so that no messages are lost. The
> > communication happens over named pipes and therefore (I think) OOB
> > ft_event() is used to quiet anything besides the pipes. This all seems
> > to work, but I was confused because the ft_event() functions
> > in oob/tcp and oob/ud do not seem to contain any functionality.
> >
> > So do I try to fix the ft_event() function in oob/base/ to call the
> > registered ft_event() function, which does nothing, or do I just remove
> > the call to orte_oob.ft_event()?
>
> Sounds like you'll need to tell the OOB components to stop processing
> messages, so that will require that you insert an event into the system.
> You have to account for two things:
>
> (a) the OOB base and OOB components are operating on the orte_event_base,
> but
>
> (b) each OOB component can have multiple active modules (one per NIC) that
> are operating on their own event base/thread.
>
> So you have to start by pushing an event that calls the OOB base, which
> then loops across the components calling their ft_event interface. Each
> component would then have to create an event for each active module,
> inserting that event into the module's event base/thread. When activated,
> each module would have to shut down its message engine, and activate another
> event to notify its component that all is quiet.
>
> Once a component finds out that all its modules are quiet, it would then
> have to activate an event to the OOB base. Once the OOB base sees all
> components report quiet, then it would have to activate an event to take
> you to the next step in your process.
>
> In other words, you need to turn the quieting process into its own set of
> states and run it through the state machine. This is the only way to
> guarantee that you'll keep things orderly, and is the major change needed
> in the C/R procedure as it flows thru ORTE. You can't just progress thru a
> set of function calls as you'll inevitably run into a roadblock requiring
> that you wait for an event-driven process to complete.
>
> HTH
> Ralph
>
> >
> > On Thu, Feb 06, 2014 at 10:49:25AM -0800, Ralph Castain wrote:
> >> The only reason I can think of for an OOB ft-event would be to tell the
> >> OOB to stop sending any messages. You would need to push that into the
> >> event library and use a callback event to let you know when it was done.
> >>
> >> Of course, once you did that, the OOB would no longer be available to,
> >> for example, tell the local daemon that the app is ready for checkpoint :-)
> >>
> >> Afraid I'll have to defer to Josh H for any further guidance.
> >>
> >>
> >> On Feb 6, 2014, at 8:15 AM, Adrian Reber <adrian_at_[hidden]> wrote:
> >>
> >>> When I initially made the C/R code compile again I made the following
> >>> change:
> >>>
> >>> diff --git a/orte/mca/rml/oob/rml_oob_component.c b/orte/mca/rml/oob/rml_oob_component.c
> >>> index f0b22fc..90ed086 100644
> >>> --- a/orte/mca/rml/oob/rml_oob_component.c
> >>> +++ b/orte/mca/rml/oob/rml_oob_component.c
> >>> @@ -185,8 +185,7 @@ orte_rml_oob_ft_event(int state) {
> >>> ;
> >>> }
> >>>
> >>> - if( ORTE_SUCCESS !=
> >>> - (ret = orte_oob.ft_event(state)) ) {
> >>> + if( ORTE_SUCCESS != (ret = orte_rml_oob_ft_event(state)) ) {
> >>> ORTE_ERROR_LOG(ret);
> >>> exit_status = ret;
> >>> goto cleanup;
> >>>
> >>>
> >>>
> >>> This is, of course, wrong. Now the function calls itself recursively
> >>> until it crashes. Looking at orte/mca/oob there is still an ft_event()
> >>> function, but it is disabled using "#if 0". Looking at other functions
> >>> it seems I would need to create something like
> >>>
> >>> #define ORTE_OOB_FT_EVENT(m)
> >>>
> >>> Looking at the modules in orte/mca/oob/ it seems ft_event is implemented
> >>> in some places but it never seems to have any real functionality. Is
> >>> ft_event() actually needed there?
> >>>
> >>> Adrian
> >>> _______________________________________________
> >>> devel mailing list
> >>> devel_at_[hidden]
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>

-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey