On Apr 24, 2014, at 10:47 PM, Ralph Castain <rhc_at_[hidden]> wrote:
>> And, as George pointed out, I see a trend towards heterogeneity in
>> HPC, so I'd say this feature will be rather more important in the
> We have been hearing about such "trends" for a long time, but have yet to see them actually happen. Not saying it couldn't some day - just saying it still hasn't happened in production.
MPI was designed to support heterogeneity all the way back from MPI-1.0 (1994) on these same kinds of arguments. It hasn't really panned out for more than a handful of users.
Keep in mind that data size heterogeneity is an unsolved problem. What do you do if one process sends a 4-byte integer of value 0xff000000 to a peer with only 2-byte integers?
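To make the size problem concrete, here is a minimal sketch in plain C (not Open MPI code). The helper names narrow_u32_to_u16 and low_16_bits are hypothetical, and I treat the value as unsigned to keep the arithmetic well-defined. It shows the two obvious choices a receiver has, neither of which preserves the sender's value:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the receiver's native integer is only 2 bytes wide.
 * Option A: refuse any value that cannot be represented.
 * Returns true and stores the value only if it fits in 16 bits. */
static bool narrow_u32_to_u16(uint32_t wire_value, uint16_t *out)
{
    if (wire_value > UINT16_MAX) {
        return false;               /* value would be lost; report it */
    }
    *out = (uint16_t)wire_value;
    return true;
}

/* Hypothetical helper, option B: silently truncate and keep only the
 * low 16 bits. For 0xff000000 every set bit lives in the upper half,
 * so the receiver ends up with 0. */
static uint16_t low_16_bits(uint32_t wire_value)
{
    return (uint16_t)(wire_value & 0xFFFFu);
}
```

With the value from the question, option A rejects the message and option B delivers 0. Either way the application does not get what was sent, which is why I call the problem unsolved rather than merely unimplemented.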
>> So, would repairing the code be significantly more complicated than a
>> clean extraction?
> So here's what I suggest: if someone is willing to take the lead in fixing hetero operations, and has the hardware upon which to verify it, then please step forward. Otherwise, I agree with Jeff that we should remove it and move on.
The broken parts are most likely somewhere in the datatype and/or PML code (that's my guess). Keep in mind that my only testing of this feature is in *homogeneous* mode -- i.e., I compile with --enable-heterogeneous and then run tests on homogeneous machines. Meaning: it's not only broken for actual heterogeneity, it's also broken in the "unity"/homogeneous case.
So which is more complicated: fix or remove? I don't know; as George mentions, I suspect removal is likely to be a little tricky.
But ask that question a little differently: which is more complicated, long-term maintenance of a feature which no one really tests (or even has the hardware setup to test) or removal?
To me, the answer is a little clearer that way.
That being said, there are 3 disagreements with this RFC so far:
1. George: on principle
2. Andreas: (might) use heterogeneity if it worked
3. Siegmar: uses heterogeneity in older OMPI versions in his SPARC+Intel setups