Good catch. The patch looks OK to me.
Pavel Shamis (Pasha)
On Jun 18, 2010, at 11:10 AM, nadia.derbey wrote:
> Reference is the v1.5 branch
> If an SRQ has the following settings: S,<size>,4,2,1
> 1) setup_qps() sets the following:
> 2) create_srq() sets the following:
> openib_btl->qps[qp].u.srq_qp.rd_curr_num = 1 (rd_init value)
> openib_btl->qps[qp].u.srq_qp.rd_low_local = rd_curr_num - (rd_curr_num >> 2)
>     = rd_curr_num = 1
> 3) if mca_btl_openib_post_srr() is called with rd_posted=1:
> rd_posted > rd_low_local is false
> the loop is not executed
> wr is never initialized (remains NULL)
> wr->next: address not mapped
> ==> SIGSEGV
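The arithmetic in steps 2 and 3 can be checked in isolation. This is only a minimal sketch, not the Open MPI code: the three helper names are hypothetical, and I am assuming the loop bound in mca_btl_openib_post_srr() is rd_curr_num - rd_posted.

```c
#include <stdbool.h>

/* Low watermark as quoted in step 2: rd_curr_num - (rd_curr_num >> 2). */
int srq_low_watermark(int rd_curr_num)
{
    return rd_curr_num - (rd_curr_num >> 2);
}

/* Test from step 3: true means enough receives are still posted. */
bool srq_above_watermark(int rd_posted, int rd_low_local)
{
    return rd_posted > rd_low_local;
}

/* Assumed loop bound: number of receive descriptors still to post. */
int srq_num_to_post(int rd_curr_num, int rd_posted)
{
    return rd_curr_num - rd_posted;
}
```

With rd_curr_num = 1 and rd_posted = 1, the watermark is 1, the watermark test is false, and the loop bound is 0, so the loop body never runs and wr stays NULL.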
> The attached patch solves the problem by ensuring that we'll actually
> enter the loop and leave otherwise.
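For readers without the attachment, the idea of "enter the loop or leave" could look roughly like the sketch below. It is not the attached patch: struct recv_wr is a hypothetical stand-in for struct ibv_recv_wr, and build_wr_list() is an invented helper that only models how the work request list is chained.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ibv_recv_wr: only the link matters here. */
struct recv_wr {
    struct recv_wr *next;
};

/*
 * Chain num_post descriptors taken from pool[] into a singly linked list.
 * Returns the list head, or NULL when there is nothing to post.
 */
struct recv_wr *build_wr_list(struct recv_wr *pool, int num_post)
{
    struct recv_wr *wr_list = NULL, *wr = NULL;
    int i;

    /* Guard: leave before the loop if it would not execute,
     * so wr is never dereferenced while it is still NULL. */
    if (num_post <= 0) {
        return NULL;
    }

    for (i = 0; i < num_post; i++) {
        struct recv_wr *item = &pool[i];
        if (wr_list == NULL) {
            wr_list = item;      /* first descriptor becomes the head */
        } else {
            wr->next = item;     /* append to the previous descriptor */
        }
        wr = item;
    }
    wr->next = NULL;             /* safe: the loop ran at least once */
    return wr_list;
}
```

Called with num_post = 0, the helper returns NULL instead of touching wr; with a positive count it returns a NULL-terminated list starting at pool[0].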
> Can someone have a look please: the patch solves the problem with my
> reproducer, but I'm not sure the fix covers all the situations.