Open MPI Development Mailing List Archives

Subject: [OMPI devel] v1.5: sigsegv in case of extremely low settings in the SRQs
From: nadia.derbey (Nadia.Derbey_at_[hidden])
Date: 2010-06-18 11:10:49


Hi,

This refers to the v1.5 branch.

If an SRQ has the following settings: S,<size>,4,2,1

1) setup_qps() sets the following:
mca_btl_openib_component.qp_infos[qp].u.srq_qp.rd_num=4
mca_btl_openib_component.qp_infos[qp].u.srq_qp.rd_init=rd_num/4=1

2) create_srq() sets the following:
openib_btl->qps[qp].u.srq_qp.rd_curr_num = 1 (rd_init value)
openib_btl->qps[qp].u.srq_qp.rd_low_local = rd_curr_num - (rd_curr_num >> 2) = rd_curr_num = 1

3) if mca_btl_openib_post_srr() is called with rd_posted=1:
rd_posted > rd_low_local is false
num_post=rd_curr_num-rd_posted=0
the loop is not executed
wr is never initialized (remains NULL)
wr->next: address not mapped
         ==> SIGSEGV
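
To make the arithmetic in steps 1) to 3) concrete, here is a minimal standalone sketch. It only models the counters named above; struct srq_model and its helper functions are hypothetical names for illustration and are not part of the openib BTL.

```c
#include <stddef.h>

/* Hypothetical stand-in for the SRQ counters named in the report. */
struct srq_model {
    int rd_num;       /* receive descriptors requested in the QP spec */
    int rd_init;      /* rd_num / 4, as derived in step 1) */
    int rd_curr_num;  /* starts at rd_init, as in step 2) */
    int rd_low_local; /* rd_curr_num - (rd_curr_num >> 2) */
};

/* Derive the counters the same way steps 1) and 2) describe. */
static struct srq_model srq_model_init(int rd_num)
{
    struct srq_model m;

    m.rd_num = rd_num;
    m.rd_init = rd_num / 4;
    m.rd_curr_num = m.rd_init;
    m.rd_low_local = m.rd_curr_num - (m.rd_curr_num >> 2);
    return m;
}

/* Number of receive requests the post loop of step 3) would build. */
static int srq_model_num_post(const struct srq_model *m, int rd_posted)
{
    return m->rd_curr_num - rd_posted;
}

/*
 * Returns 1 when the "rd_posted > rd_low_local" check does not stop
 * the post and yet there is nothing to post, i.e. the loop body
 * would never run. Returns 0 otherwise.
 */
static int srq_model_empty_post(const struct srq_model *m, int rd_posted)
{
    if (rd_posted > m->rd_low_local) {
        return 0; /* the low-watermark check fails, as in step 3) */
    }
    return srq_model_num_post(m, rd_posted) <= 0;
}
```

With srq_model_init(4) and rd_posted=1, srq_model_empty_post() returns 1, which matches the case reported here. In this model the same thing happens whenever rd_curr_num is below 4, because rd_curr_num >> 2 is then 0 and rd_low_local equals rd_curr_num.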

The attached patch fixes the problem by checking that the loop will
actually be entered, and returning early otherwise.
Could someone please have a look? The patch fixes the problem with my
reproducer, but I'm not sure it covers every situation.
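
For readers without the attachment, the kind of guard described above can be sketched as follows. This is not the attached patch: struct recv_wr and build_chain() are hypothetical, simplified stand-ins, and the assumption is that the crash comes from terminating the chain through wr->next after a loop that never ran.

```c
#include <stddef.h>

/* Hypothetical, simplified work request; not the real verbs type. */
struct recv_wr {
    struct recv_wr *next;
};

/*
 * Chain num_post entries taken from pool[] and return the head,
 * or NULL when there is nothing to post.
 */
static struct recv_wr *build_chain(struct recv_wr *pool, int num_post)
{
    struct recv_wr *head = NULL;
    struct recv_wr *wr = NULL;
    int i;

    if (num_post <= 0) {
        return NULL; /* guard: leave before wr is ever dereferenced */
    }

    for (i = 0; i < num_post; i++) {
        struct recv_wr *cur = &pool[i];

        if (NULL == head) {
            head = cur;
        } else {
            wr->next = cur;
        }
        wr = cur;
    }

    wr->next = NULL; /* safe: the loop ran at least once */
    return head;
}
```

Without the num_post <= 0 check, a call with num_post=0 would reach the final wr->next assignment with wr still NULL, which is the failure mode described in step 3).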

Regards,
Nadia