
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] scif btl side effects
From: Gilles Gouaillardet (gilles.gouaillardet_at_[hidden])
Date: 2014-05-12 03:17:11


Nathan,

On 2014/05/08 4:21, Hjelm, Nathan T wrote:
> c) that being said, that should work so there is a bug
> d) there is a regression in v1.8 and a bug that might have been always here
> This is probably not a regression. The SCIF btl has been part of the 1.7 series for some time. The nightly MTTs are probably missing one of the cases that causes this problem. Hopefully we can get this fixed before 1.8.2.
As explained in #4610 (https://svn.open-mpi.org/trac/ompi/ticket/4610),
the root cause is in the way data are unpacked.

The scif btl is ok :-)

When using --mca btl scif,self, fragments can be received out of order,
and that can trigger a bug introduced by r31496.

That being said, --mca btl scif,vader,self does not work even with r31496
reverted.
The root cause is another bug in the way data are unpacked. It also
happens when fragments are received out of order
*and* fragments contain a subpart of a predefined datatype.
In this case, the vader btl received a fragment of size 1325 *and* out
of order, and that caused the bug.
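
To illustrate the condition described above, here is a minimal Python sketch. It is not Open MPI's unpack code, and every name in it (pack, fragment, unpack_by_offset, unpack_assuming_order) is hypothetical. It assumes a stream of 8-byte doubles cut into 1325-byte fragments: since 1325 is not a multiple of 8, a fragment that starts on an element boundary ends in the middle of a double. An unpacker that places each fragment at its own byte offset tolerates reordering, while one that assumes arrival order does not.

```python
import struct

FRAG_SIZE = 1325  # fragment size quoted in the message; not a multiple of 8


def pack(values):
    """Pack doubles into one contiguous little-endian byte stream."""
    return struct.pack(f"<{len(values)}d", *values)


def fragment(stream, size=FRAG_SIZE):
    """Split the stream into (byte_offset, payload) fragments."""
    return [(off, stream[off:off + size]) for off in range(0, len(stream), size)]


def unpack_by_offset(frags, total_len):
    """Copy each fragment to its own offset, so arrival order does not matter."""
    buf = bytearray(total_len)
    for off, payload in frags:
        buf[off:off + len(payload)] = payload
    return list(struct.unpack(f"<{total_len // 8}d", bytes(buf)))


def unpack_assuming_order(frags):
    """Naive variant: ignores offsets and trusts the arrival order."""
    stream = b"".join(payload for _, payload in frags)
    return list(struct.unpack(f"<{len(stream) // 8}d", stream))


if __name__ == "__main__":
    values = [float(i) for i in range(500)]   # 4000 bytes, assumed payload
    frags = fragment(pack(values))            # sizes: 1325, 1325, 1325, 25
    out_of_order = [frags[1], frags[0], frags[3], frags[2]]
    print(unpack_by_offset(out_of_order, 4000) == values)
    print(unpack_assuming_order(out_of_order) == values)
```

The first check compares the offset-based result with the original values, the second does the same for the order-dependent variant; only the former is expected to match when fragments arrive out of order.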

Gilles