Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Resilient ORTE
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2011-06-09 08:36:01


On Wed, Jun 8, 2011 at 5:37 PM, Wesley Bland <wbland_at_[hidden]> wrote:
> On Tuesday, June 7, 2011 at 4:55 PM, Josh Hursey wrote:
>
> - orte_errmgr.post_startup() starts the persistent RML message. There
> does not seem to be a shutdown version of this (to deregister the RML
> message at orte_finalize time). Was this intentional, or just missed?
>
>  I just missed that one. I've added that into the code now.

Cool.

>
> - in the orte_errmgr.set_fault_callback: it would be nice if it
> returned the previous callback, so you could layer more than one
> 'thing' on top of ORTE and have them chain in a sigaction-like manner.
>
>  Again, you are correct. Rather than just returning the previous callback
> (if any) I think it makes more sense to maintain a list of callbacks and
> have the errmgr call them directly. That way applications/ompi layers don't
> have to worry about calling another callback function.

I would prefer returning the previous callback rather than relying on
the errmgr to get the ordering right. Additionally, when I want to
unregister (or replace) a callback, it is easier to do that through a
single interface than to introduce a new one just to remove a
particular callback.
Register:
  ompi_errmgr.set_fault_callback(my_callback, prev_callback);
Deregister:
  ompi_errmgr.set_fault_callback(prev_callback, old_callback);
or to eliminate all callbacks (if you needed that for some reason):
  ompi_errmgr.set_fault_callback(NULL, old_callback);
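
To make the sigaction-like chaining concrete, here is a minimal
standalone sketch. Everything in it is an assumption for illustration:
the callback type fault_cb_t, the single static slot, and the helper
notify_fault() are hypothetical stand-ins, not the actual errmgr
interface or its signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; the real errmgr signature may differ. */
typedef void (*fault_cb_t)(int failed_rank);

/* The single callback slot owned by the (mock) errmgr. */
static fault_cb_t current_cb = NULL;

/* Install new_cb and hand back whatever was installed before,
 * in the spirit of sigaction()'s oldact argument. */
static int set_fault_callback(fault_cb_t new_cb, fault_cb_t *prev_cb)
{
    if (prev_cb != NULL) {
        *prev_cb = current_cb;
    }
    current_cb = new_cb;
    return 0;
}

/* Stand-in for the errmgr reporting a failed process. */
static void notify_fault(int failed_rank)
{
    if (current_cb != NULL) {
        current_cb(failed_rank);
    }
}

static fault_cb_t lower_cb = NULL;  /* saved by the upper layer */
static int upper_calls = 0;
static int lower_calls = 0;

static void lower_layer_cb(int failed_rank)
{
    (void)failed_rank;
    lower_calls++;
}

/* The upper layer does its own work, then chains to the layer below. */
static void upper_layer_cb(int failed_rank)
{
    upper_calls++;
    if (lower_cb != NULL) {
        lower_cb(failed_rank);
    }
}

static void chain_demo(void)
{
    fault_cb_t old_cb = NULL;

    /* Lower layer registers first; nothing was installed before it. */
    set_fault_callback(lower_layer_cb, &old_cb);
    assert(old_cb == NULL);

    /* Upper layer registers and remembers the lower layer's callback. */
    set_fault_callback(upper_layer_cb, &lower_cb);
    notify_fault(3);                 /* both layers see this failure */

    /* Upper layer deregisters by restoring the callback it saved. */
    set_fault_callback(lower_cb, &old_cb);
    notify_fault(4);                 /* only the lower layer sees this */

    /* Eliminate all callbacks. */
    set_fault_callback(NULL, &old_cb);
    notify_fault(5);                 /* nobody is called */
}
```

Calling chain_demo() from a main() exercises the register, deregister,
and clear steps in order. The point of the design is that each layer
decides for itself whether and when to call the previous callback, so
the errmgr never has to maintain or order a list.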

>
> - orte_process_info.max_procs: this seems to be only used in the
> binomial routed, but I was a bit unclear about its purpose. Can you
> describe what it does, and how it is used?
>
> I use this to determine how many processes were in the job before we started
> having failures. This helps me preserve the structure of the tree as much as
> possible rather than completely reorganizing the routing layer every time a
> process fails.

Sounds fine, I was just curious.

Reorganizing the routing layer after every process failure has some
race issues with multiple rolling failures, so preserving the original
routing tree and rerouting is probably best for this situation. We can
revisit this later for more performance-preserving techniques, but it
is not really something that needs to be addressed now.
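
As a rough illustration of "preserve the original tree and reroute",
here is a small sketch. It assumes a simple binomial layout in which a
rank's parent is found by clearing its highest set bit, that rank 0 is
the root and stays alive, and that an alive[] array of size max_procs
is available. These are assumptions for the example only, not
necessarily how the binomial routed component lays out or tracks its
tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Parent in a simple binomial tree: clear the highest set bit.
 * Rank 0 is the root and has no parent (-1). */
static int binomial_parent(int rank)
{
    int high = 1;

    if (rank <= 0) {
        return -1;
    }
    while ((high << 1) <= rank) {
        high <<= 1;
    }
    return rank - high;
}

/* Walk up the ORIGINAL tree (sized by max_procs, the job size before
 * any failures) until a live ancestor is found. The tree shape never
 * changes; dead ranks are simply skipped. Assumes the root is alive. */
static int live_parent(int rank, const bool alive[], int max_procs)
{
    int p;

    assert(rank >= 0 && rank < max_procs);
    p = binomial_parent(rank);
    while (p > 0 && !alive[p]) {
        p = binomial_parent(p);
    }
    return p;
}
```

Because the parent relation depends only on the rank and the original
job size, two processes that learn about failures in a different order
still agree on the tree itself, and differ only in which dead ranks
they currently skip.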

>
> - in orted_comm.c: you process the ORTE_PROCESS_FAILED_NOTIFICATION
> message here. Why not push all of that logic into the errmgr
> components? It is not a big deal, just curious.
>
> Most of the actual logic that handles the processing of the error messages
> is pushed into the errmgr component. The code you see in orted_comm.c is
> almost all parsing and resending the list of dead processes to the
> appropriate modules. That code will have to be in there no matter what.
> I've updated the code and checked it into a bitbucket repository which can
> be found here:
> https://bitbucket.org/wesbland/resilient-orte/

Awesome. Thanks,
Josh

> Please let me know of any more comments,
> Wesley

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey