Open MPI Development Mailing List Archives

From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2007-05-11 15:07:40


Caitlin Bestler wrote:
[snip]
> The DAPL semantics are very clear that send/recv operations must
> be matched one to one, that the receive buffer must be large
> enough for the received message and that there must be a receive
> buffer for each incoming send/recv message. That means that
> the sender needs to have some basis for believing that the
> RECV has been posted. Usually this is an explicit credit
> that is decremented per message and incremented per response.
[snip]

As a former member of both the VI Developers Forum and DAT
Collaborative, and an implementer of VI provider software and of IB
client software, I will back up Caitlin here.

In my own words (not quoting any spec):

1) A correct DAPL consumer *shall* post receives in sufficient quantity
and of sufficient size prior to the peer posting the sends.
2) A DAPL provider's response to a lack of preposted receives is
undefined and may range from providing implicit flow control (IB) to
terminating the connection (iWARP and VI).

It appears that IB's forgiving nature here has allowed the BTL to get
away with violating the preposted recv requirement.

As an aside, my personal feeling is that even when running over IB the
preposting of recvs is worth the small overhead of piggybacking a credit
system on the messages that already cross the wire. If nothing else,
this avoids the added congestion of RNR NAKs and the resends they
trigger. Put another way, I favor programming for IB as if it lacked the
link-level flow control that the current BTL apparently assumes.
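
To make the credit idea concrete, here is a minimal sketch of piggybacked
credits, written in Python for brevity. It is a toy model under stated
assumptions, not the BTL's code and not any DAPL or verbs API: the
Endpoint class, connect(), and every method name are hypothetical. Each
side starts with one credit per receive the peer has preposted, spends a
credit per send, and gets credits back on whatever message the peer sends
next.

```python
from collections import deque


class Endpoint:
    """One side of a connection with credit-based flow control (hypothetical model)."""

    def __init__(self, recv_depth):
        self.peer = None
        self.posted_recvs = recv_depth  # receive buffers currently posted here
        self.send_credits = 0           # peer receive buffers we are allowed to consume
        self.credits_to_return = 0      # buffers reposted but not yet advertised to the peer
        self.pending = deque()          # outgoing payloads waiting for a credit
        self.inbox = deque()            # received payloads not yet consumed locally

    def send(self, payload):
        """Queue a payload, then transmit as much as the credits allow."""
        self.pending.append(payload)
        self.progress()

    def progress(self):
        """Transmit queued payloads while credits remain."""
        while self.pending and self.send_credits > 0:
            self.send_credits -= 1
            # Piggyback every credit we owe the peer on this message.
            returned, self.credits_to_return = self.credits_to_return, 0
            self.peer._on_message(self.pending.popleft(), returned)

    def recv(self):
        """Consume one received payload and repost its buffer."""
        payload = self.inbox.popleft()
        self.posted_recvs += 1
        self.credits_to_return += 1
        return payload

    def return_credits(self):
        """Credit-only update for one-way traffic.

        Assumption: modeled as free. A real implementation would have to
        reserve a receive buffer for it or deliver it some other way.
        """
        returned, self.credits_to_return = self.credits_to_return, 0
        self.peer.send_credits += returned

    def _on_message(self, payload, returned):
        if self.posted_recvs == 0:
            # Provider behavior is undefined here: an RNR NAK on IB,
            # possibly a dropped connection on iWARP or VI.
            raise RuntimeError("message arrived with no receive posted")
        self.posted_recvs -= 1
        self.send_credits += returned
        self.inbox.append(payload)


def connect(a, b):
    """Pair two endpoints; initial credits equal the peer's preposted receives."""
    a.peer, b.peer = b, a
    a.send_credits, b.send_credits = b.posted_recvs, a.posted_recvs


a, b = Endpoint(recv_depth=2), Endpoint(recv_depth=2)
connect(a, b)
for i in range(5):
    a.send(i)        # only two go out; the other three wait for credits
b.recv()
b.recv()             # b consumes both messages and reposts the buffers
b.send("reply")      # the reply carries two credits back to a
a.progress()         # a may now send two more
```

Because a sender never transmits without a credit, the "no receive
posted" branch is never reached in this model, which is the point: the
consumer never relies on whatever the provider happens to do in that case.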

-Paul

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900