Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Remove heterogeneous support
From: Gilles Gouaillardet (gilles.gouaillardet_at_[hidden])
Date: 2014-04-28 01:55:20


I might have misunderstood Jeff's comment:

> The broken part(s) is(are) likely somewhere in the datatype and/or PML code (my guess). Keep in mind that my only testing of this feature is in *homogeneous* mode -- i.e., I compile with --enable-heterogeneous and then run tests on homogeneous machines. Meaning: it's not only broken for actual heterogeneity, it's also broken in the "unity"/homogeneous case.

Unfortunately, a trivial send/recv can hang in this case
(--enable-heterogeneous and a homogeneous cluster of little-endian procs).

I opened #4568 (https://svn.open-mpi.org/trac/ompi/ticket/4568) in order
to track this issue
(uninitialized data can cause a hang with this config).

Trunk is affected; v1.8 is very likely affected too.

Gilles

On 2014/04/28 12:22, Ralph Castain wrote:
> I think you misunderstood his comment. It works fine on a homogeneous cluster, even with --enable-hetero. I've run it that way on my cluster.
>
> On Apr 27, 2014, at 7:50 PM, Gilles Gouaillardet <gilles.gouaillardet_at_[hidden]> wrote:
>
>> According to Jeff's comment, Open MPI compiled with
>> --enable-heterogeneous is broken even on a homogeneous cluster.
>>
>> As a first step, MTT could be run with Open MPI compiled with
>> --enable-heterogeneous on a homogeneous cluster
>> (ideally on both little- and big-endian machines) in order to identify and fix the
>> bug/regression.
>> /* this build is currently disabled in the MTT config of the
>> cisco-community cluster */
>>
>> Gilles
>>