Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Resilient ORTE
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2011-06-09 13:02:28


As long as there is the ability to remove and replace a callback, I'm
fine. I personally think that forcing the errmgr to track the ordering
of callback registration makes it a more complex solution, but that is
acceptable as long as it works.

In particular I need to replace the default 'abort' errmgr call in
OMPI with something else. If both are called, then this does not help
me at all - since the abort behavior will be activated either before
or after my callback. So can you explain how I would do that with the
current or the proposed interface?
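
To make the question concrete, here is a minimal C sketch of the
replace-and-return-previous pattern from the proposal quoted below. It
is an illustration under assumptions, not the errmgr interface: the
names fault_callback_t, set_fault_callback (written here as a free
function with an out-parameter), report_fault, default_abort_cb and
my_recover_cb are all hypothetical, and the "abort" stand-in only
prints so that the sketch keeps running.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback type; the real errmgr signature is not given here. */
typedef int (*fault_callback_t)(int failed_rank);

/* Exactly one callback is installed at a time (hypothetical storage). */
static fault_callback_t current_cb = NULL;

/* Install new_cb; hand back the previously installed callback via prev_cb. */
static void set_fault_callback(fault_callback_t new_cb, fault_callback_t *prev_cb)
{
    if (prev_cb != NULL) {
        *prev_cb = current_cb;
    }
    current_cb = new_cb;
}

/* Stand-in for the default 'abort' behavior: reports only, does not abort. */
static int default_abort_cb(int failed_rank)
{
    printf("default: would abort after failure of rank %d\n", failed_rank);
    return -1;
}

/* Replacement behavior that handles the fault instead of aborting. */
static int my_recover_cb(int failed_rank)
{
    printf("custom: handling failure of rank %d\n", failed_rank);
    return 0;
}

/* Invoke the installed callback; returns 1 if none is installed. */
static int report_fault(int failed_rank)
{
    return current_cb != NULL ? current_cb(failed_rank) : 1;
}

/* Replace the default, trigger a fault, then restore the default. */
static int demo_replace_default(void)
{
    fault_callback_t prev = NULL, old = NULL;

    set_fault_callback(default_abort_cb, NULL);  /* initial state */
    set_fault_callback(my_recover_cb, &prev);    /* register: default is replaced */
    assert(prev == default_abort_cb);

    int handled = report_fault(3);               /* only my_recover_cb runs */

    set_fault_callback(prev, &old);              /* deregister: default restored */
    assert(old == my_recover_cb);
    return handled;
}
```

Because only one callback is installed at a time in this sketch, the
abort stand-in never runs alongside the replacement. With an interface
that calls every registered callback in order, the same effect would
need a separate way to remove the default.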

-- Josh

On Thu, Jun 9, 2011 at 12:54 PM, Ralph Castain <rhc_at_[hidden]> wrote:
> I agree - let's not get overly complex unless we can clearly articulate a
> requirement to do so.
>
> On Thu, Jun 9, 2011 at 10:45 AM, George Bosilca <bosilca_at_[hidden]>
> wrote:
>>
>> This will require exactly opposite registration and de-registration order,
>> or no de-registration at all (aka no way to unload a component), or some
>> even more complex code to deal with it internally.
>>
>> If the error manager handles the callbacks, it can use the registration
>> ordering (which is what the approach can do) and can enforce that
>> all callbacks will be called. I would prefer this approach.
>>
>>  george.
>>
>> On Jun 9, 2011, at 08:36 , Josh Hursey wrote:
>>
>> > I would prefer returning the previous callback instead of relying on
>> > the errmgr to get the ordering right. Additionally, when I want to
>> > unregister (or replace) a callback, it is easier to do that with a
>> > single interface than by introducing a new one to remove a particular
>> > callback.
>> > Register:
>> >  ompi_errmgr.set_fault_callback(my_callback, prev_callback);
>> > Deregister:
>> >  ompi_errmgr.set_fault_callback(prev_callback, old_callback);
>> > or to eliminate all callbacks (if you needed that for some reason):
>> >  ompi_errmgr.set_fault_callback(NULL, old_callback);
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey