I see that in the OOB CPC for the openib BTL, when setting up the send
side of the QP, we set the rnr_retry value depending on whether the
remote receive queue is a per-peer (PP) queue or an SRQ:
- SRQ: btl_openib_rnr_retry MCA param value
- PP: 0
The rationale given in a comment is that setting the RNR retry count to
0 is a good way to find bugs in our flow control, since any RNR NAK then
shows up as an error instead of being quietly retried.
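
For reference, the current behavior amounts to something like the sketch below. This is not the actual OOB CPC code: the enum, the function name, and the parameter names are made up for illustration, and mca_rnr_retry simply stands in for the btl_openib_rnr_retry MCA param value.

```c
/* Illustrative only: hypothetical names, not the real openib BTL symbols. */
enum qp_type { QP_TYPE_PP, QP_TYPE_SRQ };

/* Pick the rnr_retry value for the send side of a QP, based on the
 * type of the remote receive queue. */
static int current_rnr_retry(enum qp_type remote_rq, int mca_rnr_retry)
{
    if (QP_TYPE_PP == remote_rq) {
        /* Never retry on an RNR NAK, so flow control bugs surface
         * immediately as errors. */
        return 0;
    }
    /* SRQ: honor the MCA param value. */
    return mca_rnr_retry;
}
```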
Do we really want this in production builds? Or do we want 0 only for
developer builds, with production builds using the same
btl_openib_rnr_retry value for PP queues as for SRQs?
Or should we offer a finer-grained control, such as:
- btl_openib_rnr_retry_pp: value to use for per-peer queues, -1=use the default
- btl_openib_rnr_retry_srq: value to use for SRQs, -1=use the default
- btl_openib_rnr_retry: value to use as the default for _pp and _srq
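
To make the finer-grained option concrete, here is a rough sketch of how the three values could be resolved. None of this exists in the tree: the struct, the macro, and the function are hypothetical, and only the three param names come from the proposal above.

```c
/* Hypothetical sketch of the proposed params; not existing code. */
#define RNR_RETRY_USE_DEFAULT (-1)

struct rnr_retry_params {
    int rnr_retry;      /* btl_openib_rnr_retry: default for _pp and _srq */
    int rnr_retry_pp;   /* btl_openib_rnr_retry_pp: -1 = use the default */
    int rnr_retry_srq;  /* btl_openib_rnr_retry_srq: -1 = use the default */
};

/* Return the rnr_retry value to use for a QP whose remote receive
 * queue is an SRQ (remote_is_srq != 0) or a per-peer queue (== 0). */
static int resolve_rnr_retry(const struct rnr_retry_params *p,
                             int remote_is_srq)
{
    int v = remote_is_srq ? p->rnr_retry_srq : p->rnr_retry_pp;

    /* Fall back to the shared default when the specific value is -1. */
    return (RNR_RETRY_USE_DEFAULT == v) ? p->rnr_retry : v;
}
```

With rnr_retry=7, rnr_retry_pp=0, and rnr_retry_srq=-1, this reproduces today's behavior (0 for PP, 7 for SRQ). Leaving rnr_retry_pp at -1 would make PP queues follow the default as well, so a developer build and a production build could differ only in the default value of the _pp param.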