Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] regression with derived datatypes
From: Gilles Gouaillardet (gilles.gouaillardet_at_[hidden])
Date: 2014-05-09 05:08:24

I investigated further with --mca btl scif,self

I found that the previous patch I posted was completely wrong, and I
apologize for it.

On the brighter side, and IMHO, the issue only occurs if fragments are
received (and then processed) out of order.
(I did not observe this with the tcp btl, but I always see it with
the scif btl; I guess it can also be observed with openib+RDMA.)

In this case only, opal_convertor_generic_simple_position(...) is
invoked, and it does not set pConvertor->pStack
as r31496 expects.

I will run some more tests now.


On 2014/05/08 2:23, George Bosilca wrote:
> Strange. The outcome and the timing of this issue seem to highlight a link with the other datatype-related issue you reported earlier, and, as suggested by Ralph, with Gilles' scif+vader issue.
> Generally speaking, the mechanism used to split the data in the case of multiple BTLs is identical to the one used to split the data into fragments. So, if the culprit is in the splitting logic, one might see some weirdness as soon as we force the exclusive usage of the send protocol with an unconventional fragment size.
> In other words, using the following flags "--mca btl tcp,self --mca btl_tcp_flags 3 --mca btl_tcp_rndv_eager_limit 23 --mca btl_tcp_eager_limit 23 --mca btl_tcp_max_send_size 23" should always transfer wrong data, even when only one single BTL is in play.
> George.
> On May 7, 2014, at 13:11 , Rolf vandeVaart <rvandevaart_at_[hidden]> wrote:
>> OK. So, I investigated a little more. I only see the issue when I am running with multiple ports enabled such that I have two openib BTLs instantiated. In addition, large message RDMA has to be enabled. If those conditions are not met, then I do not see the problem. For example:
>> mpirun -np 2 -host host1,host2 -mca btl_openib_if_include mlx5_0:1,mlx5_0:2 -mca btl_openib_flags 3 MPI_Isend_ator_c
>> PASS:
>> mpirun -np 2 -host host1,host2 -mca btl_openib_if_include mlx5_0:1 -mca btl_openib_flags 3 MPI_Isend_ator_c
>> mpirun -np 2 -host host1,host2 -mca btl_openib_if_include mlx5_0:1,mlx5_0:2 -mca btl_openib_flags 1 MPI_Isend_ator_c
>> So we must have some type of issue when we break up the message between the two openib BTLs. Maybe someone else can confirm my observations?
>> I was testing against the latest trunk.