On Feb 8, 2008, at 10:38 AM, Ralph Castain wrote:
> I thought maybe we should move this to another thread as it really
> isn't about Torsten's specific RFC.
> I just took a quick gander at the code base to see how extensive this
> problem might really be per Terry's concern. What I found was that we
> have added 3rd party code in several places. How we want to define
> them in terms of this issue is probably something for discussion.
> Packages I could readily identify include:
> 1. event library
> 4. backtrace
> 5. PLPA - this one is a little less obvious, but it is still being
> released as a separate package
FWIW, these packages are part of "core" OMPI and are not especially
problematic. We upgrade them when we have a need or desire to (which
has been low frequency); we don't try to stay in sync with their
release schedules at all.
> 2. ROMIO
ROMIO has traditionally been a problem (keeping up with its releases
and patches). We have long since agreed that we definitely want to
include ROMIO in our tarball, even though that presents challenges.
One thing that makes it *slightly* easier is that Brian added the
mechanics for OMPI to use a ROMIO that is outside of Open MPI rather
than the one that is bundled with it. It's not a perfect solution,
but it does help some.
> 3. VT
> 6. libNBC
These two are definitely in the "contrib" category.
> There may well be others - these are only the ones I know about. By
> 3rd party package, I mean these are blocks of code obtained as a
> complete, distinct version and "dropped in" to the OMPI code
> repository, and then to some degree tied into our build system. They
> are not code specifically developed for OMPI by OMPI developers.
Those are all that I'm aware of.
> We have already discussed the issues with this approach. I am
> concerned with the maintenance and release cycle issues right now.
> If these packages could be linked to our code instead of embedded
> within it, then it seems to me that updating them could become much
> easier. For example, we could download and install the latest ROMIO +
> Panasas compile it, and simply link it into libompi - without
> occupying someone with constantly fixing the build system issues, etc.
- event, backtrace, PLPA, and ROMIO are included in OMPI because we wanted to
certify them as part of "core" OMPI. That is, we wanted to certify
the whole system (vs. relying on [untested] combinations of versions
that already exist on users' systems).
- ROMIO is likely the only one of that group that presents ongoing
logistics problems. The mechanism Brian added was seen as a
workaround. Argonne will definitely need to be involved at some level
to improve the ROMIO integration. Some talks started between Brian,
me, and Rob (ANL) about a) making our integration better/easier, and b)
having access to the ROMIO SVN to be able to suck down releases when
we want to, but they kinda tapered off (Brian left and I got other
priorities). There was also talk of LANL maintaining its own ROMIO
tree and pushing it into OMPI, but I don't know what happened there.
I can help with part of the ROMIO make-the-integration-easier work (not in
the immediate future, though -- probably not for a few weeks), but I
do not think that I can do it on an ongoing basis. Note, too, that
ROMIO is no longer distributed as a separate package -- it's only
included in MPICH2. So it's a little harder to just link against a
ROMIO that is already installed on a system -- there won't be one that
isn't already bundled with an MPI.
- VT and libNBC are a different category; they are add-on
functionality, not "core" OMPI.
> Obviously, I don't claim to know enough about what was done to
> ROMIO to know if this would easily work. I only use it to illustrate
> the point - the same could be said about the event library, for example.
> Given our maintenance support problems, it would seem to me that
> the way we do 3rd party packaging may be worth consideration and some
> effort. I can't prioritize that relative to 1.3, though I do note
> that, from LANL's perspective, the ROMIO issue is a definite blocker
> for 1.3.
Hmm. This is odd because of the prior statements about ROMIO from
LANL (that LANL was going to maintain ROMIO and push it into OMPI).
I'm assuming that's changed?
If ROMIO is a v1.3 blocker for LANL, can LANL commit resources to
fixing the problem?