Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] regression with derived datatypes
From: Gilles Gouaillardet (gilles.gouaillardet_at_[hidden])
Date: 2014-05-07 23:03:38


On 2014/05/08 2:15, Ralph Castain wrote:
> I wonder if that might also explain the issue reported by Gilles regarding the scif BTL? In his example, the problem only occurred if the message was split across scif and vader. If so, then it might be that splitting messages in general is broken.
>
I am afraid there is a misunderstanding:
the problem always occurs with scif,vader,self (regardless of the ompi v1.8
version)
the problem occurs with scif,self only if r31496 is applied to ompi v1.8

In my previous email
http://www.open-mpi.org/community/lists/devel/2014/05/14699.php
I reported the following interesting fact:

with ompi v1.8 (latest r31678), the following command produces incorrect
results :
mpirun -host localhost -np 2 --mca btl scif,self ./test_scif

but with ompi v1.8 r31309, the very same command produces correct results

Elena pointed out that r31496 is a suspect, so I took the latest v1.8
(r31678), reverted r31496 and ...

mpirun -host localhost -np 2 --mca btl scif,self ./test_scif

works again !

note that the "default"
mpirun -host localhost -np 2 --mca btl scif,vader,self ./test_scif
still produces incorrect results

in order to reproduce the issue, a MIC is *not* needed,
you only need to install the software stack, load the mic kernel module
and make sure you can read/write /dev/mic/*
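
The access requirement above can be checked with a short shell sketch. The helper name `check_mic_access` and its optional directory argument are hypothetical, introduced here only for illustration; the sketch only tests file permissions and says nothing about whether the kernel module is loaded correctly.

```shell
# Hypothetical helper: report whether every node under /dev/mic
# (or under the directory given as $1) is readable and writable
# by the current user.
check_mic_access() {
    dir="${1:-/dev/mic}"
    status=0
    for node in "$dir"/*; do
        # If the glob matched nothing, $node is the literal pattern.
        if [ ! -e "$node" ]; then
            echo "no device nodes under $dir"
            return 1
        fi
        if [ -r "$node" ] && [ -w "$node" ]; then
            echo "ok: $node"
        else
            echo "no read/write access: $node"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_mic_access` with no argument inspects /dev/mic; a non-zero return status means at least one node is missing or not accessible.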

bottom line, there are two issues here:
1) r31496 broke something:
mpirun -np 2 -host localhost --mca btl scif,self ./test_scif
2) something else never worked:
mpirun -np 2 -host localhost --mca btl scif,vader,self ./test_scif
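
To compare the two cases side by side, both command lines can be driven from one small shell sketch. The function name `run_repro` and the `DRY_RUN` variable are hypothetical; the mpirun arguments are the ones quoted in this mail. The sketch assumes `./test_scif` is in the current directory, and it does not assume that incorrect results show up in the exit status, so the program output still has to be inspected by hand.

```shell
# Hypothetical helper: run (or just print) the reproducer for both BTL lists.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually execute them.
run_repro() {
    for btl in scif,self scif,vader,self; do
        # Build the command line as positional parameters.
        set -- mpirun -host localhost -np 2 --mca btl "$btl" ./test_scif
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$*"
        else
            echo "== btl: $btl"
            "$@" || echo "mpirun exited with status $? for btl=$btl"
        fi
    done
}
```

Running `run_repro` on a build with and without r31496 separates the regression (scif,self) from the case that never worked (scif,vader,self).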

Gilles