
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] BTL receive callback
From: George Bosilca (bosilca_at_[hidden])
Date: 2009-07-21 12:19:12


Based on your code, the only reason I can imagine for the second send
never being triggered is that the request is already considered
complete at that point.

I can't imagine how the free is called without a prior send. If I look
at the code pml_ob1_sendreq.c:1061, the free is only called when the
send fails, but it is always preceded by a send.
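One way to confirm the ordering described above is to record the BTL entry points as they are called and scan the trace for a free that releases a prepared descriptor with no send in between. The following is only a minimal sketch: the event names and the helper function are hypothetical instrumentation, not Open MPI identifiers.

```c
#include <stddef.h>

/* Hypothetical event kinds for a hand-instrumented BTL trace;
 * these are NOT Open MPI identifiers. */
typedef enum { EV_PREPARE_SRC, EV_SEND, EV_FREE } trace_event_t;

/* Return the index of the first EV_FREE that releases a prepared
 * descriptor with no EV_SEND in between, or -1 if there is none. */
static int first_free_without_send(const trace_event_t *ev, size_t n)
{
    int prepared = 0; /* nonzero: a descriptor was prepared but not yet sent */

    for (size_t i = 0; i < n; i++) {
        switch (ev[i]) {
        case EV_PREPARE_SRC:
            prepared = 1;
            break;
        case EV_SEND:
            prepared = 0;
            break;
        case EV_FREE:
            if (prepared) {
                return (int)i; /* free() reached without a send() */
            }
            break;
        }
    }
    return -1;
}
```

Fed the sender-side sequence from the report below (prepare_src, send, free, prepare_src, free), the helper flags the last free, which is the call that should not happen according to the reading of the code above.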

Can you check the return values of ompi_convertor_pack and
prepare_src, please?
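As a rough illustration of what such a check could look like, the sketch below validates values captured around a prepare_src() call. The struct, its field names, and the helper are all hypothetical, and treating a negative pack return code as failure is an assumption rather than a statement about the convertor API.

```c
#include <stddef.h>

/* Hypothetical record filled in by instrumentation placed around
 * prepare_src(); none of these names exist in Open MPI. */
typedef struct {
    int    pack_rc;      /* value returned by the pack call            */
    size_t requested;    /* payload bytes requested on entry           */
    size_t packed;       /* payload bytes actually packed              */
    size_t reserve;      /* header bytes reserved for the upper layer  */
    size_t alloc_size;   /* bytes obtained from alloc()                */
    int    desc_is_null; /* nonzero if prepare_src() returned NULL     */
} prepare_check_t;

/* Return NULL if nothing looks suspicious, otherwise a short
 * description of the first thing worth investigating. */
static const char *check_prepare(const prepare_check_t *c)
{
    if (c->pack_rc < 0) {            /* assumption: negative means error */
        return "pack call reported an error";
    }
    if (c->desc_is_null) {
        return "prepare_src returned NULL";
    }
    if (c->packed != c->requested) { /* may be legitimate, but worth a look */
        return "packed fewer or more bytes than requested";
    }
    if (c->alloc_size != c->reserve + c->packed) {
        return "allocation size does not equal reserve + packed bytes";
    }
    return NULL;
}
```

With the sizes from the second fragment in the report below (58 requested, 58 packed, 32 reserved, 90 allocated), the size checks pass, so the values still unknown are the return codes themselves.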

   george.

On Jul 21, 2009, at 11:55 , Sebastian Rinke wrote:

> Hello,
> I am developing a new BTL component (Open MPI v1.3.2) for a new
> 3D-torus interconnect. During a simple transfer of a 16362-byte
> message between two nodes using MPI_Send() and MPI_Recv(), I
> encounter the following:
>
> The sender:
> -----------
>
> 1. prepare_src() size: 16304 reserve: 32
> -> alloc() size: 16336
> -> ompi_convertor_pack(): 16304
> 2. send()
> 3. component_progress()
> -> send cb ()
> -> free()
> 4. component_progress()
> -> recv cb ()
> -> prepare_src() size: 58 reserve: 32
> -> alloc() size: 90
> -> ompi_convertor_pack(): 58
> -> free() size: 90   <-- send() is missing!
> 5. NO PROGRESS
>
> The receiver:
> -------------
>
> 1. component_progress()
> -> recv cb ()
> -> alloc() size: 32
> -> send()
> 2. component_progress()
> -> send cb ()
> -> free() size: 32
> 3. component_progress() forever!
>
> The problem is that after prepare_src() for the 2nd fragment, the
> sender calls free() instead of send() in its recv cb. Thus, the 2nd
> fragment is never transmitted, and the receiver waits for it
> indefinitely.
>
> I have found that mca_pml_ob1_recv_frag_callback_ack() is the
> corresponding recv cb. Before diving into the ob1 code, could you
> tell me under which conditions this cb calls free() instead of
> send(), so that I can get an idea of where to look for errors in my
> BTL component?
>
> Thank you very much in advance.
>
> Sebastian Rinke