Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] BTL move - the notion
From: Richard Graham (rlgraham_at_[hidden])
Date: 2008-12-05 11:10:59


> I think we all agree that STCI and OMPI have different objectives and
> requirements. OMPI is facing the need to launch and operate at extreme scales
> by next summer, has received a lot of interest in having it report errors
> into various systems, etc. We don't have all the answers as to what will be
> necessary to meet these requirements, but indications so far are that tighter
> integration, not deeper abstraction, between the various layers will be
> needed. By that, I don't mean we will violate abstraction layers, but rather
> that the various layers need to work more as a tightly tuned instrument, with
> each layer operating based on a clear knowledge of how the other layers are
> functioning.

OMPI and STCI are two different things altogether, and I have a vested interest
in both, and have no desire to have either go south. You have a set of
requirements at LANL which are important, and we also have a set of
requirements at ORNL, and as such we need to compromise on these in the code
base. We have MPI-level goals, which will be accomplished in the OMPI code
base, and tools and other related goals that will be accomplished in other code
bases. We both have the need to function well at the high end, so we have the
same set of goals there.

>
> For example, for modex-less operations, the MPI/BTLs have to know that the
> RTE/OS will be providing certain information. This means that they don't have to
> go out and discover it themselves every time. Yes, we will leave that as the
> default behavior so that small and/or unmanaged clusters can operate, but we
> have to also introduce logic that can detect when we are utilizing this
> alternative capability and exploit it. While we are trying our best to avoid
> introducing RTE-like calls into the code, the fact is that we may well have to
> do so (we have already identified one btl that will definitely need to). It is
> simply too early to make the decision to cut that off now - we don't know what
> the long-term impacts of such a decision will be.

This is where discussions will need to go both ways. Your changes can also
impact us, and we need to agree to those changes, just as much as you need to
agree with the changes we are proposing. This is not a code base focused on a
single institution's requirements, and we all do our best (and I believe tend
to succeed) at helping meet all of our needs.

>
> Finally, although I don't do much on the MPI layer, I am concerned about
> performance. I would tend to oppose any additional abstraction until we can
> measure the performance impact. Thus, I would like to see the BTL move done on a
> tmp branch (technology to branch up to the implementer - I don't care) so we can
> verify that it isn't hurting us in some unforeseeable manner.

Agreed - at least for the last phase of what we are suggesting, but we can
talk about this. I am a bit confused about how the location of the source code
has anything to do with how it performs at run-time. At this stage we have said
nothing about changing the way the BTL works, just cosmetic things. When it
comes to enabling the use of STCI with OMPI, then these issues will come up,
and will need to be addressed very carefully. To be honest, since we don't want
to change the BTLs (aside from adding some attributes), I don't expect this to
be an issue, UNLESS we end up needing to change some data structures for
abstraction purposes. This is where we need to be very careful. If you look at
what has happened with the BTLs (actually first the PTLs) historically, I have
been one of the ones pushing hard for improved performance - why would this
change now?

>
>
> So I guess my concerns really boil down to dealing with conflicting schedules
> and requirements, how to support multiple possibly competing groups that want to
> share one or more parts of our code base, and retaining an OMPI-first philosophy
> when it comes to what changes get made. My proposed solution is:

This is the problem we face all the time, and on a regular basis we as a
community do our best to help each other out. This is one of the reasons 1.3
is as late as it is, and this is a good thing that will continue as long as
this is a community project.

>
> 1. shift our repository to a technical solution that supports broader code
> sharing
>
> 2. have the non-OMPI groups access our code base via that technology. They can
> "pull" changes at will, subject to the licensing agreement. It is true that they
> may have to do some local editing if the change hits a spot where they have
> local mods to support their system, but both Hg and GIT are very good at
> handling this - much better than svn ever has been.
>
> 3. if there are minor mods required to make the BTL code area easier to share
> via the above methods, then we should explore and implement them. Certainly,
> renaming #define values would seem a no-brainer. I suspect there are other
> similar things that could be done. Removing orte/opal dependencies is more
> controversial and would need to thoroughly be examined.
>
> 4. OMPI decides what changes get made to its code base. We are polite about it
> and talk to the other groups to try and minimize impact, but ultimately we do
> what is best for OMPI, and send out notifications (perhaps a new mailing list
> specifically for that purpose) when changes occur. Note that this would have
> helped the Eclipse group enormously as otherwise they drown in the devel list
> trying to spot the changes.

I don't see that anything else is being proposed. The emerging STCI community
and the OMPI community are not two non-overlapping groups, and the run-time
support we want to bring into OMPI is to support new functionality. The main
point is that this is not STCI vs. OMPI at all.

Rich

>
> My $0.0002 - hope it helps
> Ralph
>
>
> On Dec 4, 2008, at 6:00 PM, Richard Graham wrote:
>
> Let me start the e-mail conversation, and see how far we get.
>
> Goal: The goal several of us have is to be able to use the btl’s outside of
> the MPI layer in Open MPI. The layer itself is generic, w/o specific knowledge
> of Upper Level Protocols, so is well suited for this sort of use.
>
> Technical Approach: What we have suggested is to start the process with the
> Open MPI code base, and make it independent of the mpi-layer (which it is now),
> and the run-time layer.
>
> Before we get into any specific technical details, the first question I have
> is: are people totally opposed to the notion of making the btl’s independent
> of MPI and the run-time?
> This does not mean that it can’t be used by it, but that there are well
> defined abstraction layers, i.e., are people against the goal in the first
> place?
>
> What are alternative suggestions to the technical approach?
>
> One suggestion has been to branch and patch. To me this is a long-term
> maintenance nightmare.
>
> What are people’s thoughts here?
>
> Rich
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel