Actually, we should then also print a different error message when
RNR occurs on PP QPs. It should be something along the lines of
"flow control problem occurred; this shouldn't happen..." (right now
it says RNR happened and goes into detail about what that means -- but
that's not the real problem).
I'll do that as well.
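
As a minimal sketch of the idea (all names below are hypothetical and not taken from the actual openib BTL code): the RNR retry count is chosen per receive-queue type, and the text printed on an RNR error depends on the same type. The message strings are placeholders, and the 0..7 range check reflects the 3-bit rnr_retry field in InfiniBand.

```c
#include <assert.h>

/* Hypothetical receive-queue types: per-peer (PP) or shared (SRQ). */
enum qp_type { QP_TYPE_PP, QP_TYPE_SRQ };

/* Pick the RNR retry count for the QP.
 * mca_rnr_retry stands in for the btl_openib_rnr_retry MCA parameter. */
static int choose_rnr_retry(enum qp_type type, int mca_rnr_retry)
{
    /* rnr_retry is a 3-bit field, so valid values are 0..7. */
    assert(mca_rnr_retry >= 0 && mca_rnr_retry <= 7);

    /* PP queues rely on software flow control, so any RNR is a bug:
     * use 0 retries so it surfaces immediately. */
    return (type == QP_TYPE_SRQ) ? mca_rnr_retry : 0;
}

/* Pick the error text shown when an RNR error is reported. */
static const char *rnr_error_message(enum qp_type type)
{
    if (type == QP_TYPE_PP) {
        /* On PP, RNR is a symptom; the real problem is flow control. */
        return "flow control problem occurred; this shouldn't happen";
    }
    /* Placeholder for the existing, more detailed RNR explanation. */
    return "RNR retry count exceeded (receiver not ready)";
}
```

With this shape, the error handler only needs the queue type of the failing QP to decide which message to show.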
On Feb 13, 2008, at 12:59 AM, Gleb Natapov wrote:
> On Tue, Feb 12, 2008 at 05:41:13PM -0500, Jeff Squyres wrote:
>> I see that in the OOB CPC for the openib BTL, when setting up the
>> side of the QP, we set the rnr_retry value depending on whether the
>> remote receive queue is a per-peer or SRQ:
>> - SRQ: btl_openib_rnr_retry MCA param value
>> - PP: 0
>> The rationale given in a comment is that setting the RNR to 0 is a
>> good way to find bugs in our flow control.
>> Do we really want this in production builds? Or do we want 0 for
>> developer builds and the same btl_openib_rnr_retry value for PP
> The comment is mine and IMO it should stay that way for production
> builds. SW flow control either works or it doesn't, and if it doesn't I
> prefer to know about it immediately. Setting PP to some value greater
> than 0 just delays the manifestation of the problem, and in the case of
> iWarp such a possibility doesn't even exist.