Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] BTL move - the notion
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-12-11 13:54:32

(chiming in a bit after the fact)

In general, I agree with most of what has been stated.

1. The BTLs should remain "owned" by Open MPI. There are OMPI member
organizations in multiple projects that want to use the BTLs, but the
BTLs are primarily for the Open MPI project.

2. An incremental patch approach would likely be best; my definition
of that would be "small branch and merge". I strongly endorse hg or
git for this; they are *VERY* good at exactly this kind of thing.
Much, much larger code bases than OMPI pervasively use hg/git for the
branch/patch/merge model with very good success. If you "grew up" on
CVS/SVN (and earlier), this may seem counter-intuitive -- but please
realize that tools have evolved significantly since then.

3. Moving the BTL code to different parts of the source tree won't
matter much in terms of performance and (mostly) abstractions. But we
should check, just to make sure we didn't muck something up. This is
a complex code base, after all.

4. Adding new functionality to the BTL (e.g., bootstrapping) is
subject to #1.

5. Ralph outlined the case for tighter integration between the RTE and
the BTLs. I think it's reasonable, and I agree with his case. We can
add abstractions to ensure that nothing is ORTE-specific and to ensure
that we can safely handle the case where some other underlying RTE
doesn't have the same capabilities (none of this stuff is likely to be
in the performance-critical code path, so it's not too much of an
issue). But allowing other RTEs under the OMPI MPI layer shouldn't
restrict what we want/can do with our own OMPI-specific RTE.

Just my $0.00000000000002....

On Dec 5, 2008, at 11:10 AM, Richard Graham wrote:

>> > I think we all agree that STCI and OMPI have different objectives
>> and requirements. OMPI is facing the need to launch and operate at
>> extreme scales by next summer, has received a lot of interest in
>> having it report errors into various systems, etc. We don't have
>> all the answers as to what will be necessary to meet these
>> requirements, but indications so far are that tighter integration,
>> not deeper abstraction, between the various layers will be needed.
>> By that, I don't mean we will violate abstraction layers, but
>> rather that the various layers need to work more as a tightly tuned
>> instrument, with each layer operating based on a clear knowledge of
>> how the other layers are functioning.
> OMPI and STCI are two different things altogether, and I have a vested
> interest in both, and have no desire to have either go south. You have
> a set of requirements at LANL which are important, and we also have a
> set of requirements at ORNL, and as such we need to compromise on these
> in the code base. We have MPI-level goals, which will be accomplished
> in the OMPI code base, and tools and other related goals that will be
> accomplished in other code bases. We both have the need to function
> well at the high end, so have the same set of goals there.
> >
> > For example, for modex-less operations, the MPI/BTLs have to know
> that the RTE/OS will be providing certain information. This means
> that they don't have to go out and discover it themselves every
> time. Yes, we will leave that as the default behavior so that small
> and/or unmanaged clusters can operate, but we have to also introduce
> logic that can detect when we are utilizing this alternative
> capability and exploit it. While we are trying our best to avoid
> introducing RTE-like calls into the code, the fact is that we may
> well have to do so (we have already identified one btl that will
> definitely need to). It is simply too early to make the decision to
> cut that off now - we don't know what the long-term impacts of such
> a decision will be.
> This is where discussions will need to go both ways. Your changes
> also can impact us, and we need to agree
> to those changes, just as much as you need to agree with the changes
> we are proposing. This is not a code
> base focused on a single institution's requirements, and we all do
> our best (and I believe tend to
> succeed) at helping meet all of our needs.
> >
> > Finally, although I don't do much on the MPI layer, I am concerned
> about performance. I would tend to oppose any additional abstraction
> until we can measure the performance impact. Thus, I would like to
> see the BTL move done on a tmp branch (technology to branch up to
> the implementer - I don't care) so we can verify that it isn't
> hurting us in some unforeseeable manner.
> Agreed - at least for the last phase of what we are suggesting, but
> we can talk about this. I am a bit confused about how the location of
> the source code has anything to do with how it performs at run-time.
> At this stage we have said nothing about changing the way the btl
> works, just cosmetic things. When it comes to enabling the use of STCI
> with OMPI, then these issues will come up, and need to be addressed
> very carefully. To be honest, since we don't want to change the btl's
> (aside from adding some attributes), I don't expect this to be an
> issue, UNLESS we end up needing to change some data structures for
> abstraction purposes. This is where we need to be very careful. If you
> look at what has happened with the btl's (actually first the PTL's)
> historically, I have been one of the ones pushing hard for improved
> performance - why would this change now?
> >
> >
> > So I guess my concerns really boil down to dealing with
> conflicting schedules and requirements, how to support multiple
> possibly competing groups that want to share one or more parts of
> our code base, and retaining an OMPI-first philosophy when it comes
> to what changes get made. My proposed solution is:
> This is the problem we face all the time, and on a regular basis we
> as a community do our best to help
> each other out. This is one of the reasons 1.3 is as late as it is,
> and this is a good thing that will
> continue as long as this is a community project.
> >
> > 1. shift our repository to a technical solution that supports
> broader code sharing
> >
> > 2. have the non-OMPI groups access our code base via that
> technology. They can "pull" changes at will, subject to the
> licensing agreement. It is true that they may have to do some local
> editing if the change hits a spot where they have local mods to
> support their system, but both Hg and GIT are very good at handling
> this - much better than svn ever has been.
> >
> > 3. if there are minor mods required to make the BTL code area
> easier to share via the above methods, then we should explore and
> implement them. Certainly, renaming #define values would seem a
> no-brainer. I suspect there are other similar things that could be
> done. Removing orte/opal dependencies is more controversial and
> would need to be thoroughly examined.
> >
> > 4. OMPI decides what changes get made to its code base. We are
> polite about it and talk to the other groups to try and minimize
> impact, but ultimately we do what is best for OMPI, and send out
> notifications (perhaps a new mailing list specifically for that
> purpose) when changes occur. Note that this would have helped the
> Eclipse group enormously as otherwise they drown in the devel list
> trying to spot the changes.
> I don't see that anything else is being proposed. The emerging STCI
> community and the OMPI community are not two non-overlapping groups,
> and the run-time support we want to bring into OMPI is there to
> support new functionality. The main point is that this is not STCI
> vs. OMPI at all.
> Rich
> >
> > My $0.0002 - hope it helps
> > Ralph
> >
> >
> > On Dec 4, 2008, at 6:00 PM, Richard Graham wrote:
> >
> > Let me start the e-mail conversation, and see how far we get.
> >
> > Goal: The goal several of us have is to be able to use the btl’s
> outside of the MPI layer in Open MPI. The layer itself is generic,
> w/o specific knowledge of Upper Level Protocols, so is well suited
> for this sort of use.
> >
> > Technical Approach: What we have suggested is to start the process
> with the Open MPI code base, and make it independent of the mpi-layer
> (which it is now), and the run-time layer.
> >
> > Before we get into any specific technical details,
> > the first question I have is are people totally opposed to the
> notion of making the btl’s independent of MPI and the run-time ?
> > This does not mean that it can’t be used by it, but that there
> are well defined abstraction layers, i.e., are people against the
> goal in the first place ?
> >
> > What are alternative suggestions to the technical approach ?
> >
> > One suggestion has been to branch and patch. To me this is a
> long-term maintenance nightmare.
> >
> > What are people's thoughts here?
> >
> > Rich
> >
> >
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> >
> >
> >

Jeff Squyres
Cisco Systems