Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-05-09 07:51:07


On May 9, 2007, at 1:37 AM, Or Gerlitz wrote:

> Doing a bit of zoom out from the "how to make ofed's udapl work for
> ompi" thread, my thinking is that the ompi udapl btl enablement is
> actually only the first step, where for production/longterm/etc you
> want to have an rdmacm btl.

I think this is a bit of a misunderstanding. The "BTL" in Open MPI
is a byte transfer layer; it is a point-to-point abstraction for
moving bytes between two processes. BTL components (read: plugins)
are typically distinguished by the underlying protocols used. For
example, we have an RC verbs-based BTL and we have a separate uDAPL-
based BTL. Andrew is also working on a research-quality UD verbs-
based BTL.

Hence, how a particular BTL component makes connections between
process peers is really a side-effect of moving bytes around, and not
the focus of the BTL. So having a "rdmacm" BTL doesn't really make
sense. If both the RC and UD verbs-based BTLs someday use the RDMA
CM for connections, we might abstract the connection management out
to a common piece of code between the two. But that's a different
issue. If we end up having a mixed BTL someday that uses both RC and
UD, then the need for the common code may go away. But that's in the
future.
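
To make the separation concrete, here is a minimal C sketch of the idea. Everything in it is hypothetical (the names btl_component_t, conn_mgr_t, cm_ensure_connected, and btl_send are invented for illustration and are not the actual Open MPI BTL interface), and the memcpy merely stands in for a real transfer. It only shows that a component is identified by its transport protocol, that connecting happens as a side effect of sending, and that two components could share the same connection-setup code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT the Open MPI BTL API. */

#define MAX_PEERS 8

/* Connection-management state, kept separate from any one transport. */
typedef struct conn_mgr {
    int connected[MAX_PEERS];   /* 1 once a peer has been wired up */
    int connects_made;          /* number of connections established */
} conn_mgr_t;

/* Shared helper: connect to `peer` only if not already connected. */
static int cm_ensure_connected(conn_mgr_t *cm, int peer)
{
    if (peer < 0 || peer >= MAX_PEERS)
        return -1;
    if (!cm->connected[peer]) {
        cm->connected[peer] = 1;    /* stand-in for the real handshake */
        cm->connects_made++;
    }
    return 0;
}

/* A byte-transfer component, identified by its transport protocol. */
typedef struct btl_component {
    const char *protocol;       /* e.g. "rc-verbs" or "ud-verbs" */
    conn_mgr_t cm;              /* per-component connection state */
    size_t bytes_moved;         /* bookkeeping for this sketch only */
} btl_component_t;

/* Move `len` bytes to `peer`; connecting is a side effect of sending.
 * `peer_buf` stands in for memory owned by the remote process. */
static long btl_send(btl_component_t *btl, int peer,
                     void *peer_buf, const void *src, size_t len)
{
    assert(btl != NULL && peer_buf != NULL && src != NULL);
    if (cm_ensure_connected(&btl->cm, peer) != 0)
        return -1;
    memcpy(peer_buf, src, len);     /* stand-in for the actual transfer */
    btl->bytes_moved += len;
    return (long)len;
}
```

In this sketch the caller only ever sees btl_send; whether the connection underneath was made through the RDMA CM or some other mechanism would be hidden inside cm_ensure_connected, which is why a BTL named after the connection manager would not fit the abstraction.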

> Reasoning here is made of many arguments, among them the quickest I
> can make are:
>
> A) it seems that ompi would want to use not only RC but rather also
> UD multicast and unicast, which are not covered by udapl
>
> B) there's actually no real justification to maintain two APIs
> (namely udapl vs libibverbs/librdmacm), so down the road, only one
> of them would survive (udapl is implemented ***over*** libibverbs/
> librdmacm, so if the latter dies, so does udapl). Specifically, I
> hear here and there that the OFED stack is now on its way to be
> deployed all over the place, specifically in commercial Unix OSs
> (which want modern! code that supports IPoIB-CM, RDS, SRP, iSER, etc.,
> you name it) so eventually the rdmacm btl can be used also over
> Solaris et al.

I think that's not quite the point.

1. A piece of history: the uDAPL BTL was originally developed by a
grad student just as an excuse to learn the BTL interface and OMPI
internals. We already had an RC verbs-based BTL at the time.

2. When Sun joined Open MPI, they took over the development and
maintenance of the uDAPL BTL because uDAPL is the only high
performance stack on Solaris.

3. It's fine that Sun will someday support the same verbs interface
that OFED does. But *today*, they don't. So for their current
customers, they need to support uDAPL. As such, we have done little/
no testing of uDAPL on OFED since Sun took over the uDAPL BTL -- all
testing since that point has been on Solaris uDAPL. All of our Linux/
OFED efforts have been on the verbs interface.

4. The Open MPI focus on uDAPL over OFED at the moment is simply to
jump-start iWARP testing. Both NetEffect and Chelsio have chimed in
to say that they will do the RDMA CM work for Open MPI, but uDAPL can
serve as a temporary workaround that is usable [effectively]
immediately while they get up to speed on the Open MPI code base and
do the RDMA CM work.

-- 
Jeff Squyres
Cisco Systems