Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: Remove heterogeneous support
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-04-24 22:47:15

On Apr 24, 2014, at 12:05 PM, Andreas Schäfer <gentryx_at_[hidden]> wrote:

> Hey,
> On 14:49 Thu 24 Apr , George Bosilca wrote:
>> On Thu, Apr 24, 2014 at 1:06 PM, Jeff Squyres (jsquyres)
>> <jsquyres_at_[hidden]> wrote:
>>> The code is unused. It has been unused for a long time. It is
>>> unlikely to be fixed.
> We'd be using it, probably not in production, but in research and
> teaching -- if it was operational.
> And, as George pointed out, I see a trend towards heterogeneity in
> HPC, so I'd say this feature will be rather more important in the
> future.

We have been hearing about such "trends" for a long time, but have yet to see them actually happen. Not saying it couldn't some day - just saying it still hasn't happened in production.

>> PS: This code has implications from the datatype engine all the way up
>> to the MPI layer. It also impacts the BTL, especially the handshake for
>> the ones requiring such a protocol. It also has an impact on the
>> external32 support in MPI, for some types of architectures. So its
>> removal should be an extremely cautious and surgical operation.
> So, would repairing the code be significantly more complicated than a
> clean extraction?

Unless someone volunteers to fix it, it would seem the question is moot. My employer isn't interested, and I'm not sure any of the employers within the OMPI community are currently inclined to support such an effort.

I can't speak to what George is referring to re how it was broken as I honestly don't recall the circumstances. We know it has been broken for some time, and that nobody really has a setup to test it - we can check that it compiles, but I don't think any of us actually have a hetero cluster upon which we could test it.

And as my production code friends keep pointing out - if you can't test it, then you can't "sell" it.

So here's what I suggest: if someone is willing to take the lead in fixing hetero operations, and has the hardware upon which to verify it, then please step forward. Otherwise, I agree with Jeff that we should remove it and move on.

> Cheers
> -Andreas
> --
> ==========================================================
> Andreas Schäfer
> HPC and Grid Computing
> Chair of Computer Science 3
> Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
> +49 9131 85-27910
> PGP/GPG key via keyserver
> ==========================================================
> (\___/)
> (+'.'+)
> (")_(")
> This is Bunny. Copy and paste Bunny into your
> signature to help him gain world domination!