Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] scif btl side effects
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2014-05-12 18:06:38


Ah, that's good to know. I have no problem with Gilles committing these
sorts of fixes to my components. I went ahead and committed this one
myself though.

-Nathan

On Mon, May 12, 2014 at 01:29:50PM +0000, Jeff Squyres (jsquyres) wrote:
> FWIW, Gilles has signed the OMPI IP agreement, has demonstrated care and knowledge of the OMPI code base, and is an OMPI SVN committer now.
>
> Just be aware that Gilles is about 12 hours off from North America.
>
>
>
> On May 12, 2014, at 9:13 AM, "Hjelm, Nathan T" <hjelmn_at_[hidden]> wrote:
>
> > Hah. Thanks for catching that. I will commit your patch later today.
> >
> > -Nathan
> > ________________________________________
> > From: devel [devel-bounces_at_[hidden]] on behalf of Gilles Gouaillardet [gilles.gouaillardet_at_[hidden]]
> > Sent: Monday, May 12, 2014 4:42 AM
> > To: Open MPI Developers
> > Subject: Re: [OMPI devel] scif btl side effects
> >
> > i wrote this too early ...
> >
> > the attached program produces incorrect results when run with
> > --mca btl scif,vader,self
> >
> > once the most up-to-date patch of #4610 has been applied, (at least) one
> > bug remains, and it is in the scif btl
> >
> > the attached patch fixes it.
> >
> > Gilles
> >
> > On 2014/05/12 16:17, Gilles Gouaillardet wrote:
> >> Nathan,
> >>
> >> On 2014/05/08 4:21, Hjelm, Nathan T wrote:
> >>> c) that being said, that should work so there is a bug
> >>> d) there is a regression in v1.8 and a bug that might have been always here
> >>> This is probably not a regression. The SCIF btl has been part of the 1.7 series for some time. The nightly MTTs are probably missing one of the cases that causes this problem. Hopefully we can get this fixed before 1.8.2.
> >> as explained in #4610 (https://svn.open-mpi.org/trac/ompi/ticket/4610)
> >> the root cause is in the way data are unpacked.
> >>
> >> The scif btl is ok :-)
> >>
> >> when using --mca btl scif,self fragments can be received out of order,
> >> and that can trigger a bug introduced by r31496
> >>
> >> that being said, --mca btl scif,vader,self does not work with r31496
> >> reverted.
> >> the root cause is another bug in the way data are unpacked; it also
> >> happens when fragments are received out of order
> >> *and* fragments contain a subpart of a predefined datatype.
> >> in this case, the vader btl received a fragment of size 1325 *and* out
> >> of order and that caused the bug.
> >>
> >> Gilles
> >
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: http://www.open-mpi.org/community/lists/devel/2014/05/14772.php
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/05/14773.php
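
The failure mode discussed in the thread (fragments received out of order, with a fragment boundary falling inside a predefined datatype) can be illustrated with a small standalone sketch. This is not Open MPI's unpack code; the helper names, the 500-double payload, and the offset-tagged fragment layout are all assumptions made for illustration. Only the fragment size of 1325 bytes comes from the thread. Because 1325 is not a multiple of 8, at least one double is split across two fragments, so an unpacker has to place bytes by offset rather than by arrival order.

```python
import struct

def make_fragments(payload: bytes, frag_size: int):
    """Split payload into (offset, chunk) pairs of at most frag_size bytes."""
    return [(off, payload[off:off + frag_size])
            for off in range(0, len(payload), frag_size)]

def unpack_by_offset(fragments, total_len: int) -> bytes:
    """Write each chunk at its own offset, so arrival order does not matter."""
    buf = bytearray(total_len)
    for off, chunk in fragments:
        buf[off:off + len(chunk)] = chunk
    return bytes(buf)

def unpack_in_arrival_order(fragments) -> bytes:
    """Naive unpack that ignores offsets and appends chunks as they arrive."""
    return b"".join(chunk for _, chunk in fragments)

# Hypothetical message: 500 little-endian doubles (4000 bytes).
values = [float(i) for i in range(500)]
payload = struct.pack(f"<{len(values)}d", *values)

# 1325 is not a multiple of 8, so fragment boundaries cut through doubles.
frags = make_fragments(payload, 1325)

# Simulate the first two fragments arriving swapped.
out_of_order = [frags[1], frags[0]] + frags[2:]

good = unpack_by_offset(out_of_order, len(payload))
bad = unpack_in_arrival_order(out_of_order)

print("offset-based unpack matches original:", good == payload)
print("arrival-order unpack matches original:", bad == payload)
```

The offset-based version reconstructs the original buffer even though the second fragment arrives first and begins in the middle of a double; the arrival-order version does not.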


