On Fri, 8 Feb 2008, Ralph Castain wrote:
> 1. event library
> 2. ROMIO
> 3. VT
> 4. backtrace
> 5. PLPA - this one is a little less obvious, but still being released as a
> separate package
> 6. libNBC
Sorry to Ralph, but I clipped everything from his e-mail and am now going
to make references to it. Oh well :).
One minor correction -- the entire backtrace framework is not a third
party deal. The *DARWIN/Mac OS X* component relies heavily on third party
code, but the others (Linux and Solaris) are just wrappers around code in
their respective C libraries.
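
To give a feel for what "just wrappers" means on Linux, here is a minimal
sketch built on glibc's backtrace() and backtrace_symbols_fd() from
<execinfo.h>. This is not the actual Open MPI component; the helper names
(print_stack_trace, segv_handler, install_segv_handler) and the 32-frame
limit are made up for illustration, and it assumes a glibc-based system.

```c
#define _GNU_SOURCE
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd() (glibc) */
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 32   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: write the current call stack to fd.
 * Returns the number of frames that were captured. */
static int print_stack_trace(int fd)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* The _fd variant writes straight to the descriptor and does not
     * call malloc(), which is why it is preferred inside a handler. */
    backtrace_symbols_fd(frames, count, fd);
    return count;
}

/* Hypothetical SIGSEGV handler: print a trace, then exit.
 * Note: backtrace() is not formally async-signal-safe, so treat this
 * as best-effort diagnostics only. */
static void segv_handler(int signo)
{
    static const char msg[] = "Caught SIGSEGV; stack trace follows:\n";
    ssize_t rc = write(STDERR_FILENO, msg, sizeof(msg) - 1);

    (void) rc;
    (void) signo;
    print_stack_trace(STDERR_FILENO);
    _exit(1);
}

/* Hypothetical installer: returns 0 on success, -1 on failure. */
static int install_segv_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = segv_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_handler() early in main() is enough to try it out.
Linking with -rdynamic usually makes the printed symbol names more useful.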
I believe I was responsible for the event library, ROMIO, and backtrace
before leaving LANL. I'll go through the motivations and issues with all
three in terms of integration.
Event Library: The event library is the core "rendezvous" point for all of
Open MPI, so any issues with it cause lots of issues with Open MPI in
general. We've also hacked it considerably since taking the original
libevent source -- we've renamed all the functions, we've made it
thread-safe in a way the author was unwilling to do, and we've fixed some
performance issues unique to our usage model. In short, this is no longer
really the same libevent that might already be installed on the system.
Using such an unmodified libevent would be disastrous.
ROMIO is actually one that saw significant discussion before I left Los
Alamos. There are a number of problems / issues with ROMIO. First and
foremost, without ROMIO, we are not a fully compliant MPI implementation.
So we have to ship ROMIO -- it's the only way to have that important check
mark. But its current integration has some issues -- it's hard to test
patches independently. There is actually a mode in the current Open MPI
tree where the MPI interface to MPI-I/O is not provided by Open MPI and no
io components are built. This is to allow users to build ROMIO
independently of Open MPI, for testing updates or whatever.
There are some disadvantages to this. First, the independent ROMIO will
use generalized requests instead of being hooked into our progress engine,
so there may be some progress issues (I never verified either way).
Second, it does mean the user has another package to build at their
site. Jeff is correct -- there was discussion about how to make the
integration "better" -- many of the changes were on our side, and we were
going to have to ask for a couple of changes from Argonne. If someone is
going to put in the considerable amount of time to make this happen, I'm
happy to write up whatever notes I can remember / find on the issue.
The Darwin backtrace component is mostly maintenance-free. It doesn't
support 64-bit Intel chips, but that's fine. Once every 18 months or so,
I need to get a new copy for the latest operating system, although the
truth is I don't think anything bad happens if we just stop doing the
updates at OS release (by the way, I did the one for Leopard, so we're
probably all going to be sick of MPI and on to other things before the
next time it has to be done). While it's useful, if the community is
really worried, it could probably be deleted. But having a stack trace
when you segfault sure is nice :).