Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Resilient ORTE
From: Wesley Bland (wbland_at_[hidden])
Date: 2011-06-18 09:31:40


That's fine. Let's say Thursday COB is now the timeout.
On Jun 18, 2011 9:10 AM, "Joshua Hursey" <jjhursey_at_[hidden]> wrote:
> Cool. Then can we hold off pushing this into the trunk for a couple days
> until I get a chance to test it? Monday COB does not give me much time since
> we just got the new patch on Friday COB (the RFC gave us 2 weeks to review
> the original patch). Would waiting until next Thursday/Friday COB be too
> disruptive? That should give me and maybe Ralph enough time to test and send
> any further feedback.
>
> Thanks,
> Josh
>
> On Jun 17, 2011, at 5:59 PM, Wesley Bland wrote:
>
>> I believe that it does. I made quite a few changes in the last checkin
>> though I didn't run your specific test this afternoon. I'll be able to try
>> it later this evening but it should be easy to test now that it's synced
>> with the trunk again.
>>
>> On Jun 17, 2011 5:32 PM, "Josh Hursey" <jjhursey_at_[hidden]> wrote:
>> > Does this include a fix for the problem I reported with mpirun-hosted
>> > processes?
>> >
>> > If not, I would ask that we hold off on putting it into the trunk
>> > until that particular bug is addressed. From my experience, tackling
>> > this particular issue requires some code refactoring, which should
>> > probably be done once in the trunk instead of two possibly disruptive
>> > commits.
>> >
>> > -- Josh
>> >
>> > On Fri, Jun 17, 2011 at 5:18 PM, Wesley Bland <wbland_at_[hidden]> wrote:
>> >> This is a reminder that the Resilient ORTE RFC is set to go into the trunk
>> >> on Monday at COB.
>> >> I've updated the code with a few of the changes that were mentioned on and
>> >> off the list (moved code out of orted_comm.c, errmgr_set_callback returns
>> >> previous callback, post_startup function, corrected normal termination
>> >> issues). Please take another look at it if you have any interest. The code
>> >> can be found here:
>> >> https://bitbucket.org/wesbland/resilient-orte/
>> >> Thanks,
>> >> Wesley Bland
>> >
>> >
>> >
>> > --
>> > Joshua Hursey
>> > Postdoctoral Research Associate
>> > Oak Ridge National Laboratory
>> > http://users.nccs.gov/~jjhursey
>