Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-05-10 19:12:10


Steve --

Can you file a trac bug about this?

On May 10, 2007, at 6:15 PM, Steve Wise wrote:

>
>>
>> There are two new issues so far:
>>
>> 1) this has uncovered a connection migration issue in the Chelsio
>> driver/firmware. We are developing and testing a fix for this now.
>> Should be ready tomorrow hopefully.
>>
>
> I have a fix for the above issue and I can continue with OMPI testing.
>
> To work around the client-must-send issue, I put a nice fat sleep in the
> udapl btl right after it calls dat_cr_accept(), in
> mca_btl_udapl_accept_connect(). This, however, exposes another issue
> with the udapl btl:
>
> Neither the client nor the server side of the udapl btl connection setup
> pre-posts RECV buffers before connecting. This can allow a SEND to
> arrive before a RECV buffer is available. I _think_ IB will handle this
> issue by retransmitting the SEND. Chelsio's iWARP device, however,
> TERMINATEs the connection. My sleep() makes this condition happen every
> time.
>
> From what I can tell, the udapl btl exchanges memory info as a first
> order of business after connection establishment
> (mca_btl_udapl_sendrecv()). The RECV buffer post for this exchange,
> however, should really be done _before_ the dat_ep_connect() on the
> active side, and _before_ the dat_cr_accept() on the server side.
> Currently it's done after the ESTABLISHED event is dequeued, thus
> allowing the race condition.
>
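To make the ordering argument above concrete, here is a toy C model of one side of a connection. It is only a sketch: every name in it (toy_endpoint, toy_post_recv, and so on) is hypothetical and is not part of uDAPL or the Open MPI source. It assumes the behavior described for the Chelsio iWARP device, namely that a SEND arriving with no RECV posted terminates the connection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one side of a connection. Hypothetical, NOT the uDAPL API. */
typedef struct {
    int  posted_recvs;  /* RECV buffers currently posted */
    bool established;   /* connection is up */
    bool terminated;    /* a SEND arrived with no RECV posted */
    int  delivered;     /* SENDs that landed in a RECV buffer */
} toy_endpoint;

void toy_init(toy_endpoint *ep)
{
    assert(ep != NULL);
    ep->posted_recvs = 0;
    ep->established  = false;
    ep->terminated   = false;
    ep->delivered    = 0;
}

void toy_post_recv(toy_endpoint *ep) { ep->posted_recvs++; }
void toy_establish(toy_endpoint *ep) { ep->established = true; }

/* The peer's SEND arrives. Assumption: no posted RECV means the
 * connection is terminated, as reported for the iWARP device. */
bool toy_send_arrives(toy_endpoint *ep)
{
    assert(ep->established);
    if (ep->terminated)
        return false;
    if (ep->posted_recvs == 0) {
        ep->terminated = true;
        return false;
    }
    ep->posted_recvs--;
    ep->delivered++;
    return true;
}

/* Ordering described as the current one: the RECV is posted only after
 * the ESTABLISHED event. peer_sends_early models the peer's SEND landing
 * in the window between establishment and the post. Returns true if the
 * connection survives. */
bool connect_post_after(bool peer_sends_early)
{
    toy_endpoint ep;
    toy_init(&ep);
    toy_establish(&ep);
    if (peer_sends_early)
        toy_send_arrives(&ep);      /* lands in the unprotected window */
    toy_post_recv(&ep);
    if (!peer_sends_early)
        toy_send_arrives(&ep);
    return !ep.terminated;
}

/* Proposed ordering: post the RECV first, then connect or accept.
 * The timing of the peer's SEND no longer matters. */
bool connect_post_before(void)
{
    toy_endpoint ep;
    toy_init(&ep);
    toy_post_recv(&ep);
    toy_establish(&ep);
    toy_send_arrives(&ep);
    return !ep.terminated;
}
```

In this model connect_post_after() survives only when the peer happens to send late, while connect_post_before() survives regardless of timing. That is the same race the sleep() makes deterministic.
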
> I believe the rule is that the ULP must ensure that a RECV is posted
> before the client can post a SEND for that buffer. And further, the ULP
> must enforce flow control somehow so that a SEND never arrives without a
> RECV buffer being available.
>
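One common way for a ULP to enforce this kind of flow control is a credit scheme, where the sender only puts a SEND on the wire if the peer has advertised a free RECV buffer for it. The sketch below shows the sender side only. It illustrates the general technique and is not a description of what the udapl btl does; all names (credit_sender, credit_send, credit_grant) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sender-side state for credit-based flow control (hypothetical names). */
typedef struct {
    int credits;  /* peer RECV buffers advertised to us and not yet used */
    int queued;   /* SENDs held back locally for lack of credit */
    int on_wire;  /* SENDs actually handed to the transport */
} credit_sender;

void credit_init(credit_sender *s, int initial_credits)
{
    assert(s != NULL && initial_credits >= 0);
    s->credits = initial_credits;
    s->queued  = 0;
    s->on_wire = 0;
}

/* Try to send one message. Returns true if it went out, false if it
 * was queued because the peer has no RECV buffer available for it. */
bool credit_send(credit_sender *s)
{
    if (s->credits == 0) {
        s->queued++;
        return false;
    }
    s->credits--;
    s->on_wire++;
    return true;
}

/* The peer reposted n RECV buffers and told us so. Drain as many
 * queued SENDs as the new credits allow. */
void credit_grant(credit_sender *s, int n)
{
    assert(n >= 0);
    s->credits += n;
    while (s->queued > 0 && s->credits > 0) {
        s->queued--;
        s->credits--;
        s->on_wire++;
    }
}
```

The initial credit count would be the number of RECV buffers the peer pre-posts before the connection comes up, so the two rules fit together: pre-post first, then never send more than the peer has room for.
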
> Perhaps this is just a bug and I opened it up with my sleep().
>
> Or is the uDAPL btl assuming the transport will deal with lack of RECV
> buffer at the time a SEND arrives?
>
> Also: given that there is _always_ a message exchange after connection
> setup, we can change that exchange to handle the client-must-send-first
> issue...
>
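One possible shape for such a change is to make the post-connect exchange role-aware: the active (client) side sends first, and the passive side holds its own SEND until the client's message has arrived. The sketch below is a guess at how that decision could be expressed, not the udapl btl's actual logic; the types and the xchg_next() helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { ROLE_ACTIVE, ROLE_PASSIVE } xchg_role;   /* active = client */
typedef enum { ACT_SEND, ACT_WAIT, ACT_DONE } xchg_action;

/* Per-endpoint progress through the initial exchange (hypothetical). */
typedef struct {
    xchg_role role;
    bool sent;      /* our message has been sent */
    bool received;  /* the peer's message has arrived */
} xchg_state;

/* Decide what this side should do next in the initial exchange. */
xchg_action xchg_next(const xchg_state *st)
{
    assert(st != NULL);
    if (!st->sent) {
        /* The passive side must not send until the client has. */
        if (st->role == ROLE_PASSIVE && !st->received)
            return ACT_WAIT;
        return ACT_SEND;
    }
    return st->received ? ACT_DONE : ACT_WAIT;
}
```

Both sides still exchange the same information; only the order is constrained, so the client's SEND is always the first message on the connection.
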
>
> Steve.
>
>

-- 
Jeff Squyres
Cisco Systems