Open MPI Development Mailing List Archives

From: Gleb Natapov (glebn_at_[hidden])
Date: 2007-06-27 14:38:55

On Wed, Jun 27, 2007 at 02:27:34PM -0400, George Bosilca wrote:
> On Jun 27, 2007, at 10:06 AM, Gleb Natapov wrote:
> >>
> >>Btw, did you compare my patch with yours on your multi-NIC system?
> >>With my patch on our system with 3 networks (two 1 Gb/s and one
> >>100 Mb/s) I'm close to 99% of the total bandwidth. I'll try to see
> >>what I get with yours.
> >Your patch segfaults on my setup, so I can't check and compare. I see
> >this in your patch:
> >+ reg = recvreq->req_rdma[bml_btl->btl_index].btl_reg;
> >But bml_btl->btl_index is not an index into the req_rdma array, and we
> >never actually initialize bml_btl->btl_index at all, so maybe it would
> >be a good idea to remove this field entirely. TCP never uses reg, so
> >there is no problem there, but for IB it has to be valid.
> My patch is so old I don't remember. I'm fairly sure that I originally
> copied and pasted from the other function, the one that doesn't take
> the BTL as an argument. If you replace bml_btl->btl_index with
> recvreq->req_rdma_idx in the faulty line, it should work again.
No, it will not. recvreq->req_rdma[bml_btl->btl_index].bml_btl is not
necessarily equal to the bml_btl that was passed to the function, so
recvreq->req_rdma[bml_btl->btl_index].btl_reg may point to the wrong
registration. The right thing to do is to loop over all entries of the
recvreq->req_rdma array and find the entry corresponding to the provided
bml_btl, and that is what I am doing in my patch.
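
The lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either patch: the types are simplified, hypothetical stand-ins for the real ob1 structures, and the names req_rdma_cnt, MAX_RDMA and find_registration are invented for the example. Only req_rdma, req_rdma_idx, bml_btl and btl_reg come from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the types discussed above. */
typedef struct { int id; } registration_t;
typedef struct { int id; } bml_btl_t;

typedef struct {
    bml_btl_t      *bml_btl;  /* BTL this entry was set up for */
    registration_t *btl_reg;  /* registration belonging to that BTL */
} rdma_entry_t;

#define MAX_RDMA 4            /* assumed upper bound, for illustration only */

typedef struct {
    rdma_entry_t req_rdma[MAX_RDMA];
    size_t       req_rdma_cnt;  /* assumed: number of valid entries */
    size_t       req_rdma_idx;  /* current position, NOT tied to a given BTL */
} recv_request_t;

/*
 * Return the registration that belongs to the given bml_btl, or NULL if
 * the request has no entry for it. The entry is found by comparing
 * pointers rather than by trusting any index, because neither
 * bml_btl->btl_index nor req_rdma_idx is guaranteed to select the entry
 * for this particular BTL.
 */
registration_t *find_registration(const recv_request_t *recvreq,
                                  const bml_btl_t *bml_btl)
{
    assert(recvreq != NULL && recvreq->req_rdma_cnt <= MAX_RDMA);

    for (size_t i = 0; i < recvreq->req_rdma_cnt; i++) {
        if (recvreq->req_rdma[i].bml_btl == bml_btl) {
            return recvreq->req_rdma[i].btl_reg;
        }
    }
    return NULL;  /* e.g. a BTL such as TCP that has no registration entry */
}
```

A caller would use find_registration(recvreq, bml_btl) in place of the indexed access and treat a NULL result as "no registration for this BTL".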