Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] PML/ob1 problem
From: George Bosilca (bosilca_at_[hidden])
Date: 2009-03-03 07:47:47


Which solution seems to be working?

This bug was fixed a while ago in the trunk
(https://svn.open-mpi.org/trac/ompi/changeset/20591) and in the 1.3 branch.
It even made it into 1.3.2.

   george.

On Mar 3, 2009, at 05:01 , Lenny Verkhovsky wrote:

> Seems to be working.
> George, can you commit it, please?
>
> Thanks
> Lenny.
>
>
> On Thu, Feb 19, 2009 at 3:05 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>> George -- any thoughts on this one?
>>
>> On Feb 11, 2009, at 1:01 AM, Mike Dubman wrote:
>>
>>>
>>> Hello guys,
>>>
>>> I'm running an experimental TCP BTL that implements the RDMA GET method
>>> and advertises it in the flags of the BTL API.
>>> The BTL's send() method returns rc=1 to select the fast path in the PML
>>> (this optimization was added in revision 18551 in v1.3).
>>>
>>> It seems that in PML/ob1, the mca_pml_ob1_send_request_start_rdma()
>>> function does not handle this combination (BTL GET + fast-path rc>0)
>>> correctly and goes into deadlock, i.e.:
>>>
>>> +++ pml_ob1_sendreq.c +670
>>> At this line, sendreq->req_state is 0
>>>
>>> +++ pml_ob1_sendreq.c +800
>>> At this line, if the BTL has a GET method and the BTL's send() returned
>>> the fast-path hint, the call to mca_pml_ob1_rndv_completion_request()
>>> will decrement sendreq->req_state by one, leaving it at -1.
>>>
>>> This value of -1 will keep send_request_pml_complete_check() from
>>> completing the request at the PML level.
>>>
>>> The PML logic (in mca_pml_ob1_send_request_start_rdma) for the PUT
>>> operation initializes req_state to "2" at pml_ob1_sendreq.c +791, but
>>> leaves req_state at 0 for GET operations.
>>>
>>> Please suggest.
>>>
>>> Thanks
>>>
>>> Mike.
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
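
The counter behavior reported in Mike's original message can be modeled with a small standalone C sketch. This is not the ob1 source: the `toy_*` names are hypothetical, only the `req_state` counter is modeled, and it assumes (as the report implies) that the PML-level completion check requires `req_state == 0`. The second completion event on the PUT path is likewise an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the ob1 send request: only the event counter is modeled. */
typedef struct {
    int  req_state;     /* completion events still expected */
    bool pml_complete;  /* set once the request completes at the PML level */
} toy_sendreq_t;

/* Models the decrement attributed to mca_pml_ob1_rndv_completion_request(). */
static void toy_rndv_completion(toy_sendreq_t *req)
{
    assert(req != NULL);
    req->req_state -= 1;
}

/* Models send_request_pml_complete_check(); ASSUMES completion needs req_state == 0. */
static bool toy_pml_complete_check(toy_sendreq_t *req)
{
    assert(req != NULL);
    if (req->req_state == 0) {
        req->pml_complete = true;
    }
    return req->pml_complete;
}

/* GET path as reported: req_state stays 0, then the fast path (rc > 0)
 * runs the rendezvous completion immediately. Returns the final counter. */
static int toy_get_fastpath_final_state(bool *completed)
{
    toy_sendreq_t req = { .req_state = 0, .pml_complete = false };
    toy_rndv_completion(&req);                  /* 0 -> -1 */
    *completed = toy_pml_complete_check(&req);  /* never true at -1 */
    return req.req_state;
}

/* PUT path as reported: req_state starts at 2. A second event is ASSUMED
 * to arrive later and bring the counter back to 0. */
static int toy_put_final_state(bool *completed)
{
    toy_sendreq_t req = { .req_state = 2, .pml_complete = false };
    toy_rndv_completion(&req);                  /* 2 -> 1 */
    req.req_state -= 1;                         /* assumed second event, 1 -> 0 */
    *completed = toy_pml_complete_check(&req);
    return req.req_state;
}
```

In this model the GET fast path ends with the counter at -1 and the request never completes, while the PUT path reaches 0 and completes, which matches the asymmetry described in the report. The sketch says nothing about how changeset 20591 actually resolved the issue.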