Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect
From: Ralph Castain (rhc_at_[hidden])
Date: 2013-11-14 15:31:09

On Nov 14, 2013, at 12:21 PM, Barrett, Brian W <bwbarre_at_[hidden]> wrote:

> On 11/14/13 1:13 PM, "Joshua Ladd" <joshual_at_[hidden]> wrote:
>> Let me try to summarize my understanding of the situation:
>> 1. Ralph made the OOB asynchronous.
>> 2. OOB cpcs don't work as a result of 1, and are thereby "deprecated",
>> meaning: won't fix.
>> 3. Pasha moved the openib/connect to common/ofacm but excluded the rdmacm
>> in that move. openib was never changed to use common/ofacm.
>> 4. UDCM is "functional" in the trunk, still sitting in openib/connect.
>> But no one is entirely sure if it really works which is why it was
>> disabled in 1.7. Nathan - is there a design doc you can share on this
>> beyond the comments in the code?
>> 5. In order to satisfy the "grand plan":
>> a. UDCM still needs to be moved to common/ofacm.
>> b. OpenIB still needs to be changed to use common/ofacm.
>> c. RDMACM still needs to migrate to common/ofacm.
>> d. XRC support needs to be added to UDCM and put into
>> common/ofacm.
>> 6. The "grand plan" being: move the BTLs into Opal - hence the need to
>> scuttle the OOB cpcs thereby justifying the deprecation and not fixing
>> cpcs after #1.
>> So, that's a quick roundup of how we ended up here (as I understand it.)
>> What needs to be done is:
> That's my understanding as well.
>> 1. Somebody needs to certify/review that what Nathan has done is sound.
>> From my perspective, this is a BIG change and needs a comprehensive
>> architecture review. We've been using it in the trunk, and we've been
>> testing it under MTT for some time - but have not deployed or tested at
>> large-scale out in the field. Would be nice to see something on paper in
>> terms of a design doc.
>> 2. Somebody then needs to move UDCM into common/ofacm.
>> 3. Somebody needs to change openib to use common/ofacm cpcs instead of
>> openib/connect cpcs.
>> 4. Somebody needs to move RDMACM into common/ofacm and make sure RoCEE
>> works.
>> 5. Somebody needs to add XRC support to UDCM - whatever that might mean.
>> Given Nathan added UDCM back in 2011 and nobody is really sure it's ready
>> for prime-time, and given Pasha's comments regarding the difference in
>> state machine requirements between the two connection schemes, this
>> doesn't seem like a trivial task.
>> Given Nathan's comments a second ago about ORNL not supporting the IB
>> Offload component, it barely makes sense to keep common/ofacm. And it
>> sounds like the two cpcs presently contained therein are now unusable.
>> All of this work is a result of the Grand Plan to move the BTLs into the
>> Opal layer. I have no idea what the motive for that plan is (I was not
>> involved with OMPI when this was decided or debated.)
>> Basically, without these five changes OpenIB is dead in 1.7.4 and beyond
>> for RC, XRC, and RoCEE. These are blockers to 1.7.4 and I don't believe
>> that the onus falls squarely on Mellanox to fix these. These were
>> community decisions and, as such, it must be a community effort to
>> resolve. We are happy to lend a hand, but we are not fixing all of this
>> mess.
> I think that the 5 steps above sound correct and I agree that 1) this
> means 1.7.4 is on hold until we fix this and 2) that Mellanox shouldn't be
> the only one to fix this for 1.7.4, given the amount of work involved.
> Ralph, what, specifically, broke about the oob/xoob cpc mechanisms by
> making the oob asynchronous?

Hard for me to say as I don't really have access to an IB machine any more. Odin is my sole reference point, and someone has had that fully locked up for more than a week (and I can't complain as I am totally a guest there). Even then, I can only test on a few nodes.

I have no objection to helping, but we need someone who cares about IB and has access to such a machine to take the lead. Otherwise, we're just spinning our wheels.

As for the work issue: note that this has been "under development" now for more than a year. We've talked at length about how "somebody" needs to fix the openib/ofacm issue, but everyone keeps pushing it down the road as "not mine". Like I said, I can help - but (a) my boss couldn't care less about this issue, and (b) I have no way to test the results.

> That is, 1-5 are a huge amount of work; have
> we done the analysis to say that updating the oob / xoob cpc to work with
> the new oob is actually more work than doing 1-5? Obviously, there are
> long-term plans that make oob/xoob problematic. But those aren't 1.7 / 1.8
> plans. Unfortunately, the cpcs were always out of my area of interest, so
> I'm flying a bit more blind than I'd like here.
> Brian
> --
> Brian W. Barrett
> Scalable System Software Group
> Sandia National Laboratories