I'll answer this outside of Terry's reply so we can stay under George's page limit. :-))

I don't have any philosophical opposition to the idea. Indeed, there are places where I could potentially use the BTLs myself, perhaps as an alternative comm channel in the OOB. I will point out, though, that several things we believed when we started this project have proven unworkable over time. For example, the idea that the RTE could be a general-purpose one without impacting OMPI proved incorrect and has been abandoned. It may well be that the notion of using the BTLs for non-OMPI projects will fall into that category as well - not saying it does, but I think it is still TBD.

That said, I do have some significant concerns about -how- this is done that fall into two categories:

1. Procedural
Keeping the common code in the OMPI repository can cause quite a bit of trouble with synchronizing release cycles. We are just about to exit a period of requested "quiet" time on the trunk to stabilize it for the 1.3 release. If STCI had been in an active development phase, this could have caused a major problem, as we would have demanded they not commit to our code repository. It is easy to foresee the reverse situation. Indeed, in my experience working on several similar projects, this problem comes up frequently. How do we intend to work this out?

I am also concerned about slowing down OMPI's development efforts due to the need to coordinate proposed changes with an even broader community, and one that will have conflicting requirements/schedules. We already have problems getting people to stay adequately involved as changes are proposed and made, especially as the community's members have become involved in other efforts over time. It would become unworkable if we take months to touch base with everyone who might be impacted and get general consensus on changes required by OMPI. As Terry said, we have to maintain OMPI's agility.

We all need to keep something in mind here. While this discussion is about the BTLs and coordinating with STCI, we are talking about a general method of operation that will have to be extended to anyone with a similar request. There already are other groups out there, some competing with STCI, that have issued similar requests for sharing various pieces of the code base (the ones coming to me mostly pertain to the RTE). So whatever we do should be generalizable - it can't just be a point solution for STCI.

I am disturbed by the immediate rejection of methods developed and used by other large code projects that address this very problem. Both Hg and Git were developed specifically with this code-sharing synchronization issue in mind, have enjoyed rapid adoption, and get rave reviews for their solutions. They provide maximum flexibility, but require a bit of a learning curve and admittedly more attention to maintenance details. However, other projects in similar circumstances have found them highly beneficial. I would think we should at least consider what is becoming the state-of-the-art method for code sharing before simply rejecting this approach as too much maintenance.


2. Technical
I think we all agree that STCI and OMPI have different objectives and requirements. OMPI is facing the need to launch and operate at extreme scales by next summer, has received a lot of interest in having it report errors into various systems, etc. We don't have all the answers as to what will be necessary to meet these requirements, but indications so far are that tighter integration, not deeper abstraction, between the various layers will be needed. By that, I don't mean we will violate abstraction layers, but rather that the various layers need to work more as a tightly tuned instrument, with each layer operating based on a clear knowledge of how the other layers are functioning.

For example, for modex-less operations, the MPI/BTLs have to know that the RTE/OS will be providing certain information. This means that they don't have to go out and discover it themselves every time. Yes, we will leave that as the default behavior so that small and/or unmanaged clusters can operate, but we have to also introduce logic that can detect when we are utilizing this alternative capability and exploit it. While we are trying our best to avoid introducing RTE-like calls into the code, the fact is that we may well have to do so (we have already identified one btl that will definitely need to). It is simply too early to make the decision to cut that off now - we don't know what the long-term impacts of such a decision will be.
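To make the shape of that logic concrete, here is a rough sketch, not actual OMPI code. Every name in it (peer_info_t, rte_lookup_fn, resolve_peer, discover_peer, example_rte_lookup) is made up for illustration, and the address strings are placeholders. It only shows the pattern: use what the RTE/OS hands us when it is available, and otherwise fall back to the default discovery path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of what a BTL needs to know about a peer. */
typedef struct {
    char address[64];
    bool from_rte;  /* true if supplied by the RTE/OS, false if discovered */
} peer_info_t;

/* Hypothetical hook: returns NULL when the RTE/OS has nothing for this peer. */
typedef const char *(*rte_lookup_fn)(int peer_rank);

/* Stand-in for the default behavior (a modex-style exchange in real life). */
static void discover_peer(int peer_rank, peer_info_t *out)
{
    snprintf(out->address, sizeof(out->address), "discovered-%d", peer_rank);
    out->from_rte = false;
}

/* Prefer RTE-provided information; fall back to discovery otherwise. */
static void resolve_peer(int peer_rank, rte_lookup_fn rte_lookup, peer_info_t *out)
{
    const char *addr = (rte_lookup != NULL) ? rte_lookup(peer_rank) : NULL;
    if (addr != NULL) {
        snprintf(out->address, sizeof(out->address), "%s", addr);
        out->from_rte = true;
        return;
    }
    discover_peer(peer_rank, out);
}

/* Made-up RTE hook for a managed system that knows every peer in advance. */
static const char *example_rte_lookup(int peer_rank)
{
    static char buf[64];
    snprintf(buf, sizeof(buf), "rte-%d", peer_rank);
    return buf;
}
```

On a small or unmanaged cluster the caller would pass NULL for the hook and get the discovery path unchanged; on a managed system it would pass something like example_rte_lookup and skip discovery. Note that the BTL side now has to know the hook exists, which is exactly the kind of coupling I am talking about.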

Finally, although I don't do much on the MPI layer, I am concerned about performance. I would tend to oppose any additional abstraction until we can measure the performance impact. Thus, I would like to see the BTL move done on a temporary branch (the choice of branching technology is up to the implementer - I don't care) so we can verify that it isn't hurting us in some unforeseeable manner.


So I guess my concerns really boil down to dealing with conflicting schedules and requirements, how to support multiple possibly competing groups that want to share one or more parts of our code base, and retaining an OMPI-first philosophy when it comes to what changes get made. My proposed solution is:

1. shift our repository to a technical solution that supports broader code sharing

2. have the non-OMPI groups access our code base via that technology. They can "pull" changes at will, subject to the licensing agreement. It is true that they may have to do some local editing if the change hits a spot where they have local mods to support their system, but both Hg and Git are very good at handling this - much better than svn ever has been.

3. if there are minor mods required to make the BTL code area easier to share via the above methods, then we should explore and implement them. Certainly, renaming #define values would seem a no-brainer. I suspect there are other similar things that could be done. Removing orte/opal dependencies is more controversial and would need to be thoroughly examined.

4. OMPI decides what changes get made to its code base. We are polite about it and talk to the other groups to try and minimize impact, but ultimately we do what is best for OMPI, and send out notifications (perhaps a new mailing list specifically for that purpose) when changes occur. Note that this would have helped the Eclipse group enormously as otherwise they drown in the devel list trying to spot the changes.

My $0.0002 - hope it helps
Ralph


On Dec 4, 2008, at 6:00 PM, Richard Graham wrote:

Let me start the e-mail conversation, and see how far we get.

Goal: The goal several of us have is to be able to use the btl's outside of the MPI layer in Open MPI.  The layer itself is generic, w/o specific knowledge of Upper Level Protocols, so is well suited for this sort of use.

Technical Approach: What we have suggested is to start the process with the Open MPI code base, and make it independent of the mpi-layer (which it is now), and the run-time layer.

Before we get into any specific technical details, the first question I have is are people totally opposed to the notion of making the btl's independent of MPI and the run-time ?
This does not mean that it can't be used by it, but that there are well defined abstraction layers, i.e., are people against the goal in the first place ?

What are alternative suggestions to the technical approach ?

One suggestion has been to branch and patch.  To me this is a long-term maintenance nightmare.

What are people's thoughts here ?

Rich
