Open MPI Development Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [OMPI devel] RFC: Remove heterogeneous support
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-04-28 12:26:36

I'm afraid I honestly don't remember the last time I tested with enable-hetero - at least 2-3 weeks ago. I'd suggest starting ~6 months back and checking whether that still worked.
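
As a rough illustration of the dichotomic search George proposes in the quoted message below, here is a minimal Python sketch. It is not anything from the Open MPI tree: `first_bad`, `is_good`, and the `r0`..`r9` revision names are all hypothetical. It assumes the oldest revision is known to work, the newest is known to hang, and the breakage persists once introduced.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_good() is False.

    Assumptions (hypothetical, for illustration only):
      - revisions are ordered oldest to newest,
      - revisions[0] is known good and revisions[-1] is known bad,
      - once the bug appears, every later revision is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid  # bug was introduced after mid
        else:
            hi = mid  # bug was introduced at or before mid
    return revisions[hi]


# Made-up history in which revision r5 introduces the hang.
history = [f"r{i}" for i in range(10)]
culprit = first_bad(history, lambda rev: int(rev[1:]) < 5)
print(culprit)
```

In practice `is_good` would stand for something like "configure with --enable-heterogeneous, build, and run a trivial send/recv under a timeout"; that step is left abstract here because the exact test procedure is not given in this thread.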

On Apr 28, 2014, at 7:04 AM, George Bosilca <bosilca_at_[hidden]> wrote:

> When did you last test? I have no idea what is broken, so it is difficult to assess the complexity of the fix. Let’s try to find the last working “version” and then run a dichotomic test to find the culprit (with s hopefully).
> George.
> On Apr 28, 2014, at 09:05 , Ralph Castain <rhc_at_[hidden]> wrote:
>> No, it looks like something has broken it since I last tested. Sorry about the confusion.
>> On Apr 27, 2014, at 10:55 PM, Gilles Gouaillardet <gilles.gouaillardet_at_[hidden]> wrote:
>>> I might have misunderstood Jeff's comment :
>>>> The broken part(s) is(are) likely somewhere in the datatype and/or PML code (my guess). Keep in mind that my only testing of this feature is in *homogeneous* mode -- i.e., I compile with --enable-heterogeneous and then run tests on homogeneous machines. Meaning: it's not only broken for actual heterogeneity, it's also broken in the "unity"/homogeneous case.
>>> Unfortunately, a trivial send/recv can hang in this case (--enable-heterogeneous and homogeneous cluster of little endian procs).
>>> I opened #4568 in order to track this issue
>>> (uninitialized data can cause a hang with this config)
>>> trunk is affected, v1.8 is very likely affected too
>>> Gilles
>>> On 2014/04/28 12:22, Ralph Castain wrote:
>>>> I think you misunderstood his comment. It works fine on a homogeneous cluster, even with --enable-hetero. I've run it that way on my cluster.
>>>> On Apr 27, 2014, at 7:50 PM, Gilles Gouaillardet <gilles.gouaillardet_at_[hidden]> wrote:
>>>>> According to Jeff's comment, OpenMPI compiled with
>>>>> --enable-heterogeneous is broken even in a homogeneous cluster.
>>>>> As a first step, MTT could be run with OpenMPI compiled with
>>>>> --enable-heterogeneous and running on a homogeneous cluster
>>>>> (ideally on both little and big endian) in order to identify and fix the
>>>>> bug/regression.
>>>>> /* this build is currently disabled in the MTT config of the
>>>>> cisco-community cluster */
>>>>> Gilles
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> Subscription:
>>> Link to this post: