Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] BTL receive callback
From: Sebastian Rinke (rinke_at_[hidden])
Date: 2009-07-23 11:37:16


> I am curious if you are indeed using a new interconnect (new
> hardware and protocol) or if it is requirements of the 3D-torus
> network that are not addressed by the openib btl that are driving
> the need for a new btl?

It is the first one: a new interconnect, with new hardware and a new protocol.

Sebastian.

> On 07/21/09 11:55, Sebastian Rinke wrote:
>> Hello,
>> I am developing a new BTL component (Open MPI v1.3.2) for a new
>> 3D-torus interconnect. During a simple transfer of a 16362 B message
>> between two nodes using MPI_Send() and MPI_Recv(), I see the
>> following:
>>
>> The sender:
>> -----------
>>
>> 1. prepare_src() size: 16304 reserve: 32
>>      -> alloc() size: 16336
>>      -> ompi_convertor_pack(): 16304
>> 2. send()
>> 3. component_progress()
>>      -> send cb ()
>>      -> free()
>> 4. component_progress()
>>      -> recv cb ()
>>      -> prepare_src() size: 58 reserve: 32
>>      -> alloc() size: 90
>>      -> ompi_convertor_pack(): 58
>>      -> free() size: 90     Send is missing !!!
>> 5. NO PROGRESS
>>
>> The receiver:
>> -------------
>>
>> 1. component_progress()
>>      -> recv cb ()
>>      -> alloc() size: 32
>>      -> send()
>> 2. component_progress()
>>      -> send cb ()
>>      -> free() size: 32
>> 3. component_progress() forever !!!
>>
>> The problem is that after prepare_src() for the 2nd fragment, the
>> sender calls free() instead of send() in its recv cb. Thus, the 2nd
>> fragment is not being transmitted.
>> As a consequence, the receiver waits for the 2nd fragment.
>>
>> I have found that mca_pml_ob1_recv_frag_callback_ack() is the
>> corresponding recv cb. Before diving into the ob1 code,
>> could you tell me under which conditions this cb calls free()
>> instead of send(), so that I can get an idea of where to look
>> for errors in my BTL component?
>>
>> Thank you very much in advance.
>>
>> Sebastian Rinke
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
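The sizes in the quoted sender trace can be cross-checked with a little arithmetic. The sketch below is only an illustration, not code from Open MPI or from the BTL under discussion: the helper names frag_alloc_size(), remaining_payload() and trace_sizes_consistent() are hypothetical, and it assumes that each alloc() request is simply the packed payload plus the 32-byte reserve, which is what the numbers in the trace suggest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an Open MPI function): size requested from
 * alloc() for one fragment, assuming it is payload + reserved header space. */
static size_t frag_alloc_size(size_t payload, size_t reserve)
{
    return payload + reserve;
}

/* Hypothetical helper: payload still to be sent after the first fragment. */
static size_t remaining_payload(size_t message, size_t first_payload)
{
    return message > first_payload ? message - first_payload : 0;
}

/* Returns 1 if the sizes reported in the trace agree with each other. */
static int trace_sizes_consistent(void)
{
    const size_t message = 16362; /* bytes passed to MPI_Send() */
    const size_t reserve = 32;    /* "reserve" argument of prepare_src() */
    const size_t first   = 16304; /* payload packed into the 1st fragment */
    const size_t second  = remaining_payload(message, first);

    return frag_alloc_size(first, reserve) == 16336  /* 1st alloc() */
        && second == 58                              /* 2nd prepare_src() */
        && frag_alloc_size(second, reserve) == 90;   /* 2nd alloc() */
}
```

Under that assumption the reported numbers are self-consistent: 16304 + 58 = 16362, so the 58-byte fragment that is freed without a send() is exactly the rest of the message the receiver keeps waiting for.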