
Open MPI Development Mailing List Archives


This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.


From: Donald Kerr (Don.Kerr_at_[hidden])
Date: 2007-05-08 16:21:18

Steve Wise wrote:

>On Tue, 2007-05-08 at 13:57 -0400, Andrew Friedley wrote:
>>Steve Wise wrote:
>>>>Well I've tried OMPI on ofed-1.2 udapl today and it doesn't work. I'm
>>>>debugging now.
>>>Here's part of the problem (from ompi/btl/udapl/btl_udapl.c):
>>> /* TODO - big bad evil hack! */
>>> /* uDAPL doesn't ever seem to keep track of ports with addresses. This
>>> becomes a problem when we use dat_ep_query() to obtain a remote address
>>> on an endpoint. In this case, both the DAT_PORT_QUAL and the sin_port
>>> field in the DAT_SOCK_ADDR are 0, regardless of the actual port. This is
>>> a problem when we have more than one uDAPL process per IA - these
>>> processes will have exactly the same address, as the port is all
>>> we have to differentiate who is who. Thus, our uDAPL EP -> BTL EP
>>> matching algorithm will break down.
>>> So, we insert the port we used for our PSP into the DAT_SOCK_ADDR for
>>> this IA. uDAPL then conveniently propagates this to where we need it.
>>> */
>>> ((struct sockaddr_in*)attr.ia_address_ptr)->sin_port = htons(port);
>>> ((struct sockaddr_in*)&btl->udapl_addr.addr)->sin_port = htons(port);
>>>The OMPI code stuffs the port chosen by udapl for a listening endpoint
>>>into the ia address memory (which is owned by the udapl layer btw).
>>>There's a slight problem with that: The OFA udapl openib_cma code binds
>>>cm_id's to this ia_address regularly. When an hca is opened, a cm_id is
>>>bound to this address to obtain the local hca port number and gid that
>>>is being used. In addition, a cm_id is bound to this address each time
>>>an endpoint is created (either at ep_create time or ep_connect time).
>>>So that ia_address field is used by the dapl cm to create local
>>>cm_ids... Since the port was always zero, the rdma-cma would choose a
>>>unique port for each cm_id bound to that address.
>>>But once OMPI sets the port field to non-zero, the rdma_cma fails all the
>>>subsequent rdma_bind_addr() calls since the port is already in use.
>>>Perhaps this hack really is a workaround for a DAPL bug where somebody's
>>>dapl wasn't tracking port numbers correctly?
>>Yep. My memory is dim, but I think that was OFED's DAPL, or it was in
>>the generic part of DAPL that all implementations seem to share.
>>As hinted by the comment (I wrote it by the way), I think the best
>>solution would be if dat_ep_query() returned the port number correctly.
>> Most of uDAPL seems to just pass around pointers to internal data
>>structures (which I'm not sure is the best idea in the world), so it
>>didn't seem like a trivial fix to me at the time. I remember
>>considering reporting this as a bug, but I didn't because the uDAPL
>>standard didn't seem to enforce any requirements on passing the port
>>number around with the address, so it technically wasn't wrong.
>>Was the OFED uDAPL code switched from something else to RDMA CM at some
>>point? I'm almost certain I was running fine on OFED's uDAPL at one
>>point (in fact, a lot of the uDAPL BTL development I did was using the
>>OFED stack).
>Yes, the OFA uDAPL was changed from using the ib-cm to the rdma-cm a
>while back. Perhaps you ran on the ib-cm version? And, the rdma-cma
>started using port numbers and enforcing uniqueness even more recently I
>Perhaps Don Kerr has some insight on how the Sun uDAPL behaves? Should
>OMPI still need this hack?
From what I recall (and Andrew can probably set me straight if I get
this wrong), this hack was included because we were not able to pull the
remote port from dat_ep_query. If dat_ep_query supplies that data, then
we could probably do away with the hack.
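
To make the matching problem concrete, here is a minimal sketch of why the
port matters when several processes share one IA. It is not the BTL's actual
matching code: make_addr() and same_peer() are hypothetical helpers invented
for illustration, and the addresses are documentation-range examples. With
the port reported as 0 on both sides, two distinct processes on the same host
compare equal; with the real port filled in, they can be told apart.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper (not part of the uDAPL BTL): build an IPv4 address.
   'port' is in host byte order; 0 models what dat_ep_query() was said to
   return for the remote port. */
static struct sockaddr_in make_addr(const char *ip, unsigned short port)
{
    struct sockaddr_in sa;
    int rc;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    rc = inet_pton(AF_INET, ip, &sa.sin_addr);
    assert(rc == 1);            /* the literal must be a valid IPv4 address */
    (void)rc;                   /* avoid an unused warning under NDEBUG */
    return sa;
}

/* Hypothetical matching rule: two endpoints refer to the same peer only
   if both the IP address and the port agree. */
static int same_peer(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_addr.s_addr == b->sin_addr.s_addr &&
           a->sin_port == b->sin_port;
}
```

Calling same_peer() on two addresses built as make_addr("192.0.2.10", 0)
reports a match even if they stand for different processes, which is the
ambiguity the hack works around; building them with ports 5001 and 5002
makes them distinct.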

I have not heard back from the developer at Sun who implemented uDAPL
for Solaris. My guess is that it was also based on the older ib-cm, but
I will confirm. I submitted a bug against Solaris uDAPL a while back
asking it to provide the port via dat_ep_query, and it looks like it has
been fixed; I just have not tested the fix because we weren't using it.
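
The bind failure Steve describes in the quoted text can be reproduced in
miniature with ordinary TCP sockets. This is only an analogy: it uses plain
bind() on the loopback address rather than rdma_bind_addr(), and
bind_loopback() is a hypothetical helper written for this sketch. It assumes
a POSIX system and does not set SO_REUSEADDR. Binding with port 0 lets the
kernel pick a free port each time, while binding a second socket to a port
that is already taken fails.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a new TCP socket to 127.0.0.1:port.
   'port' is in host byte order; 0 asks the kernel to choose a free port.
   On success, returns the socket fd and stores the bound port in
   *bound_port. On failure, returns -1 with errno preserved. */
static int bind_loopback(unsigned short port, unsigned short *bound_port)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);
    int fd;

    assert(bound_port != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0 ||
        getsockname(fd, (struct sockaddr *)&sa, &len) != 0) {
        int saved = errno;      /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    *bound_port = ntohs(sa.sin_port);
    return fd;
}
```

Two calls to bind_loopback(0, ...) should each succeed with a different
port, which corresponds to the behavior when the IA address port was left at
zero. Passing the port returned by the first call to a further call, while
the first socket is still open, is expected to fail with EADDRINUSE, which
corresponds to what happens once the PSP port is written into the shared IA
address.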