Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect
From: Barrett, Brian W (bwbarre_at_[hidden])
Date: 2013-11-14 12:33:48


On 11/14/13 9:51 AM, "Jeff Squyres (jsquyres)" <jsquyres_at_[hidden]> wrote:

>Does XRC work with the UDCM CPC?
>
>
>On Nov 14, 2013, at 9:35 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>
>> I think the problems in udcm were fixed by Nathan quite some time ago,
>>but never moved to 1.7 as everyone was told that the connect code in
>>openib was already deprecated pending merge with the new ofacm common
>>code. Looking over at that area, I see only oob and xoob - so if the
>>users of the common ofacm code are finding that it works, the simple
>>answer may just be to finally complete the switchover.
>>
>> Meantime, perhaps someone can CMR and review a copying of the udcm cpc
>>to the 1.7 branch?
>>
>>
>> On Nov 14, 2013, at 5:14 AM, Joshua Ladd <joshual_at_[hidden]> wrote:
>>
>>> Um, no. It's supposed to work with UDCM which doesn't appear to be
>>>enabled in 1.7.
>>>
>>> Per Ralph's comment to me last night:
>>>
>>> "... you cannot use the oob connection manager. It doesn't work and
>>>was deprecated. You must use udcm, which is why things are supposed to
>>>be set to do so by default. Please check the openib connect priorities
>>>and correct them if necessary."
>>>
>>> However, it's never been enabled in 1.7 - don't know what "borked"
>>>means, and from what Devendar tells me, several UDCM commits that are
>>>in the trunk have not been pushed over to 1.7:
>>>
>>> So, as of this moment, OpenIB BTL is essentially dead-in-the-water in
>>>1.7.
>>>
>>>
>>>

I'm going to start by admitting that I haven't been paying attention to IB
the last couple of months, so I'm a little out of my league here. I
remember discussions of UDCM replacing OOB, both because the OOB CPC had
some issues and because it would make it easier to move the BTLs to the
OPAL layer (i.e., below the OOB). But I also thought that was further in
the future than it evidently was. So can someone let me know:

  1) What the status of UDCM is (does it work reliably, does it support
XRC, etc.)
  2) What's the difference between CPCs and OFACM, and what are our plans
w.r.t. 1.7 there?
  3) Someone mentioned that the ofacm oob worked, but the cpc oob didn't.
Can someone explain why?

Again, sorry for being dense; I've been spending too much time in Portals
land lately.

Brian

--
  Brian W. Barrett
  Scalable System Software Group
  Sandia National Laboratories