Open MPI User's Mailing List Archives

From: Greg Lindahl (lindahl_at_[hidden])
Date: 2005-03-15 00:35:05


Jeff,

One of the interesting aspects of an MPI ABI is that some folks won't
be convinced that it's interesting until they've had it proven that it
will work, both technically and socially, which is putting the cart
before the horse to a certain extent. It's also not helpful that all I
distributed were PowerPoint slides; I did have a detailed writeup,
but it gave proposed solutions to too many problems, which is likely
to cause more arguments than it would solve.

In any case, discussion is a good thing, and here are some answers to
the issues you raise. I'm afraid that a casual observer won't
necessarily learn much from our exchange about whether or not an ABI
is possible or desirable. I also don't think I have all the answers,
but a few of the issues you raise do delve into things that I've
worked on in the past.

1. An ABI still leaves you vulnerable to different C++ and F90 issues.

On AMD64/EM64T and x86, Intel, gcc, and PathScale have compatible C++
implementations. This is not a coincidence.

The F90 issue is real, but note that it reduces to N libraries for
N compilers, instead of N*M libraries for N compilers and M MPIs.

1a. You're vulnerable to OS issues.

Yes. People who distribute binaries today already deal somewhat
successfully with that. We distribute only 4 sets of compiler RPMs,
but they happen to work successfully on RedHat Enterprise, Fedora, and
RedHat 9, SUSE Professional and Enterprise, Mandrake, Debian, Gentoo,
White Hat, Rocks, and a few others I've forgotten because they're too
obscure.

2. Choosing an MPI implementation

One existing solution is "modules". It takes the guesswork out of
setting and resetting PATH. It's definitely something that has to be
scripted or users can't use it reliably. You are correct that setting
LD_LIBRARY_PATH in an MPI job is an issue, but this needs a fix
anyway, as users run up against it all the time -- "Your Fortran docs
say I need to set this environment variable to change the output
format; it works with a serial code, but doesn't work with MPI! Help!"

3. This isn't an immediately perfect solution for all ISVs

Well, yes. I actually said that in my slides. For the average ISV,
being able to test other MPIs without recompiling is a first step
towards being able to eventually support more than one. But it's not
the only step. An above-average ISV will then be willing to let
customers use other MPIs, as long as any bugs are replicated using a
supported platform (e.g. using OpenMPI over TCP/IP). I expect it to
take a while for the average ISV to get there, but at least they have
a chance. Today they have no chance.

You also mention that some ISVs tend to distribute static linked
executables. This is true. Were I an ISV, I would distribute such an
executable for debugging purposes.

4. Non-standardized parts of MPI are an issue in real programs.

I agree with this one. I think these issues should be discussed as
part of the ABI effort. What parts are the worst problems, and can we
successfully standardize them?

5. Biologists and Chemists don't care about an ABI.

Yes, but they do care about having their apps just work. And an ABI
helps that. I can imagine a future in which an rpm for a biology app
distributes a test suite, and MPI implementers / interconnect vendors
who care will download the rpm and test it. That's easier with an ABI
than without; with an ABI you don't have to rebuild (which, for a
poorly packaged app, can be very tricky for a 3rd party to do), and
you can be fairly sure that you didn't introduce/delete a bug by
recompiling.

6. ISVs can ship MPI-independent apps today using a thin abstraction
layer.

Yes, but that wastes money and complicates testing. I wouldn't be
surprised if few of them did that. I suspect the ISVs that did that,
if any, would like an ABI more than their current system.

7. It's expensive to change sizes and types of MPI handles to a new
standard

Look at your f77 interface -- it's a roadmap for converting from
Fortran integer arguments to pointers. This is what I suggest you do
for C/C++. I'd be happy to write the code for you, actually.
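
To make the f77 analogy concrete, here is a minimal sketch of the idea,
not taken from any real MPI. The application sees a fixed-size integer
handle, and the implementation keeps a table that maps it back to its
own internal pointer. Every name here (impl_comm, handle_alloc,
handle_lookup, abi_comm_size, MAX_HANDLES) is hypothetical and chosen
only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 64   /* hypothetical table size */

/* Hypothetical internal object; a real MPI has its own structures. */
struct impl_comm {
    int size;
};

/* Table mapping small integer handles to internal pointers. */
static struct impl_comm *handle_table[MAX_HANDLES];

/* Register an internal object and return its integer handle,
   or -1 if the table is full. */
int handle_alloc(struct impl_comm *obj)
{
    assert(obj != NULL);
    for (int i = 0; i < MAX_HANDLES; i++) {
        if (handle_table[i] == NULL) {
            handle_table[i] = obj;
            return i;
        }
    }
    return -1;
}

/* Translate an integer handle back to the internal pointer.
   Returns NULL for an out-of-range or unused handle. */
struct impl_comm *handle_lookup(int h)
{
    if (h < 0 || h >= MAX_HANDLES)
        return NULL;
    return handle_table[h];
}

/* Hypothetical ABI-level entry point: takes the integer handle,
   converts it, then uses the internal object. Returns 0 on success. */
int abi_comm_size(int h, int *size)
{
    struct impl_comm *c = handle_lookup(h);
    if (c == NULL)
        return 1;
    *size = c->size;
    return 0;
}
```

With something along these lines, the C binding does the same lookup
the Fortran binding already does, and the application never sees the
implementation's pointer type.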

8. Embedded MPI might have different constraints

An ABI doesn't have to be all things to all people. I find that most
HPC embedded systems are based on a fairly general-purpose platform,
so an ABI may manage to reach most of today's embedded players. (We
do do business in that space.)

9. Is recompiling really that difficult?

I don't think you or I are the right people to answer that one; we
recompile all the time. But I assure you that the compiler people in
my company would answer YES! YES! Why? Because compilers are heavily
QAed, and we can name numerous source files in our compiler that have
obscure bugs appear if compiled with gcc -O2 instead of -O0. These
bugs come and go, too, as we rearrange our source code. So for them,
they would never send a compiler binary out the door that had not been
fully QAed -- that exact binary. No "let's recompile for one fix and
then not re-run the entire test suite."

10. OpenMPI implements multiple batch queue systems already.

I did not mean to imply that no one had done this. In fact, several
MPIs have done it. But that doesn't help the average cluster admin,
who only has 1 batch queue system.

Likewise, many "portable" MPIs support multiple interconnects. But
that doesn't solve the issues associated with multiple interconnects,
for many reasons that have been stated already.

I do find it entertaining that you say supporting several run-times
is not difficult, while changing to support one mpi.h is expensive ;-)

11. Uber-mpirun sounds hard

I have a strawman idea for it; that's the kind of thing the committee
should discuss, not something to hash out completely before the
committee ever meets. I've read many MPI startups and the strawman
addresses all of them. Quadrics and Portals both work with the concept.

I wouldn't be surprised if Uber-mpirun didn't make it into the ABI.
I'm OK with that; I don't think leaving it out fatally weakens the ABI
concept.

12. MPI-2 dynamic functions are hard to standardize

Yes, and lack of integration with queue systems is why I've never
observed a customer using them, ever. A quality-of-implementation
consensus would help these functions become useful; the area is
currently crying out for action.

> Finally, this uber-mpirun will have a consistent syntax across all
> platforms and RTEs, but what about mpiexec? The MPI Forum explicitly
> specified mpiexec to fulfill this requirement. Has it failed? Are all
> the mpiexec implementations out there so radically different as to be
> useless in terms of uniform syntax? (this is an honest question)

I don't know, I've never used it. The fact that it isn't mentioned in
the slides is simple ignorance on my part ;-) One would hope that mpiexec
was complete enough, but it would be an interesting exercise to see
if, for the purposes of this ABI, it was or not.

> Having an MPI ABI gains nothing for MPI
> implementation researchers except that they don't have to recompile
> applications for their new implementation. This is exactly the same as
> it is for everyone else (per restrictions discussed above); singling
> out MPI implementation researchers is misleading.

I'm sorry if you find it misleading. I think it is useful to
specifically call out the benefit to an extremely important group of
people, the people for whom an MPI ABI is an expense. You are in that
group, and I think an ABI won't get going if you can't be convinced.
Hence I write 232-line emails...

> - On the "Winners: Interconnect implementors" slide: Why will
> interconnect implementors only reach systems that recompile?

I was referring to recompiling *applications*, not the MPI
implementation. Quadrics isn't supported by most of the "portable"
MPIs today. So a Quadrics customer has to compile applications
themselves.

> - On the "Winners: Commercial software vendors" slide: I talked about
> this above. An ABI does *not* make testing easier --

It does, because you don't have to recompile; thus you can be confident
that you don't have new compiler issues, only MPI issues.

> Just because they don't have to recompile will not significantly reduce
> the logistics of all ISV's.

An ABI doesn't have to significantly help all ISVs to be successful.

> I don't see how automated testing becomes easier with an ABI.

If I don't have to recompile my ten-million-line application for each
interconnect, that's easier. I would suspect it cuts the work in half,
actually. Inside PathScale, for example, we build all 4 platforms
twice a day for each active branch, and archive *all* the builds, so
we can binary search for regressions. That's our minimum effort for "a
different recompile."
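
For anyone unfamiliar with the "binary search for regressions" idea,
a minimal sketch follows. It assumes the archived builds are numbered
in order and that once a regression appears, every later build also
shows it. The names first_bad_build and example_is_bad are
hypothetical, and the predicate stands in for "run the test suite
against archived build number N"; this is not our actual tooling.

```c
#include <assert.h>

/* Find the first bad build in [0, n), assuming builds 0..k-1 are good
   and builds k..n-1 are bad for some k. Returns n if no build is bad.
   is_bad(i) must return nonzero when build i shows the regression. */
int first_bad_build(int n, int (*is_bad)(int))
{
    assert(n >= 0);
    int lo = 0;   /* every build before lo is known good */
    int hi = n;   /* every build from hi onward is known bad */
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (is_bad(mid))
            hi = mid;        /* regression is at mid or earlier */
        else
            lo = mid + 1;    /* regression is after mid */
    }
    return lo;
}

/* Hypothetical stand-in for testing an archived build: pretend the
   regression first appeared in build 37. */
static int example_is_bad(int build)
{
    return build >= 37;
}
```

Each probe costs one full test run, so with about 100 archived builds
the search needs roughly 7 runs instead of up to 100. That only works
if every build was kept, which is why all of them get archived.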

> - On the "Winners: Open Source Software Projects" slide: you say
> "Tomorrow, MPI is just like everything else..." Are you saying that
> MPI will be DLL Hell just like all other packages out there?

No, that it will mostly work, like most things today. DLL hell in
Linux, from my experience, happens when you want to use an RPM that's
too wildly different, or that depends on multiple not-commonly-
distributed libraries. The typical biologist downloads a set of RPMs
built for nearly-exactly the system they're on; they stay relatively
up to date on Linux distros because the people building the RPMs tell
them to do that.

> - On the "Issues: Startup and queue systems" slide: it sounds like you
> are now talking about standardizing queue systems which is a much, much
> larger effort than just the MPI (or even the HPC) community.

No, that was not my intent. I was talking about how uber-mpirun
interfaces, in a generic way, to queue systems. I know that
standardizing anything about queue systems is bound to fail; I've
personally seen a couple of such efforts not accomplish anything.
That's one thing I like about the MPI process: it has proven to be a
big success so far.

To sum up, if such a huge scatter-brained email can be summed up, I
think the cart is a bit before the horse in this discussion, and that
it would pay for us to listen to some of the people who stand to
benefit more from an ABI before deciding it isn't interesting. After all,
MPI implementers pay the main cost of implementing an ABI, so it's no
surprise they're the hardest to convince. But I think that this
discussion is an important step towards convincing people, hopefully
including you.

-- greg