I think your proposed approach is an excellent one! I know it will take work
to implement, which raises its own issues, but I do believe that it is the
only real long-term solution.
Just my $0.002. I would be willing to help with implementation, if that
would be of use. Not sure I understand the build system well enough to just
do it, I fear.
On 2/7/08 9:34 AM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:
> All these comments are good. I confess that although I should have, I
> really did not previously consider the complexity of adding in N
> contrib packages to OMPI.
> The goal of the contrib packages is to easily allow additional
> functionality that is nicely integrated with Open MPI. An obvious way
> to do this is to include the code in the Open MPI tarball, but that
> leads to the logistics and other issues that have been identified.
> Ralph proposes a good way around this. But what about going farther
> than that: what if we offer a standardized set of hooks for
> including contrib functionality *after* core OMPI has been installed?
> Yes, it's one more step after OMPI has been installed -- but if we can
> keep it as *one* step, perhaps the user onus is not that bad. Let me
> Consider a new standalone executable: ompi_contrib. You would run
> ompi_contrib to install and uninstall contrib functionality into your
> existing OMPI:
> ompi_contrib --install http://www.example.com/nbc/nbc-ompi-contrib.tar.gz
> or ompi_contrib --install file:///home/htor/nbc-ompi-contrib.tar.gz
> This will download NBC (if http), build it, and install it into the
> current OMPI. It is likely that the nbc-ompi-contrib.tar.gz file will
> contain the real NBC tarball (or maybe just a reference to it?) plus a
> small number of hook/glue scripts for OMPI integration (perhaps quite
> similar to what is in the contrib/ tree [on the branch] today for
> NBC?). Likewise, after NBC is installed into the local OMPI
> installation, ompi_info should be able to show "nbc" as installed
> contrib functionality. It then follows that we might be able to do:
> ompi_contrib --uninstall nbc
> to uninstall contrib NBC from the local OMPI installation.
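
A minimal sketch of how the command line described above might be parsed, assuming Python. No ompi_contrib tool exists yet, so everything here is hypothetical: the function names, the step names, and the convention of deriving the contrib name ("nbc") from a file named <name>-ompi-contrib.tar.gz. It only plans the steps; it does not download, build, or install anything.

```python
import argparse
import posixpath
from urllib.parse import urlparse

# Assumed naming convention, taken from the example tarball names in the proposal.
SUFFIX = "-ompi-contrib.tar.gz"


def contrib_name(source):
    """Derive a contrib name such as "nbc" from a tarball URL (hypothetical rule)."""
    filename = posixpath.basename(urlparse(source).path)
    if not filename.endswith(SUFFIX):
        raise ValueError(f"unexpected contrib tarball name: {filename!r}")
    return filename[: -len(SUFFIX)]


def plan(argv):
    """Return (contrib name, ordered list of steps) for one invocation."""
    parser = argparse.ArgumentParser(prog="ompi_contrib")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--install", metavar="URL")
    group.add_argument("--uninstall", metavar="NAME")
    args = parser.parse_args(argv)

    if args.install is not None:
        # Only remote sources need a download step; file:// URLs are already local.
        scheme = urlparse(args.install).scheme
        steps = ["download"] if scheme in ("http", "https") else []
        steps += ["unpack", "build", "install", "register"]
        return contrib_name(args.install), steps

    # Uninstall takes the contrib name directly, as in "--uninstall nbc".
    return args.uninstall, ["remove", "unregister"]
```

The "register" and "unregister" steps stand in for whatever bookkeeping would let ompi_info list installed contrib functionality; how that record is stored is left open here.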
> This kind of approach would seem to have several benefits:
> - Keep a clear[er] distinction between core OMPI and contributed
> - Allow simple integration of MPI libraries, tools, and even
> applications (!) (think: numerical libraries, boost C++ libraries,
> etc. -- how many of your users install additional tools on top of MPI
> incorrectly?). Anything
> - Allow 3rd parties to have "contrib" code to Open MPI without needing
> to get into our code tree (and sign the 3rd party agreements, etc.),
> keeping our distribution size down, avoiding release schedule
> logistical issues, keeping our "core" build time down, etc.
> - Allow integration of contrib functionality at both a per-user and
> system-wide basis.
> What I'm really proposing here is that OMPI becomes a system that can
> have additional functionality installed / uninstalled. Based on the
> infrastructure that we already have, this is not as much of a stretch
> as one would think.
> ("who's going to write this" is a question that will also have to be
> answered, but perhaps we can discuss the code concept/idea first...)
> On Feb 7, 2008, at 10:11 AM, Ralph H Castain wrote:
>> I believe Brian and Terry raise good points. May I offer a possible
>> alternative? What if we only include in Open MPI an include file that
>> contains the "hooks" to libNBC, and have the build system only "see"
>> if someone specifies --with-NBC (or whatever option name you like).
>> If you
>> like, you can make the inclusion automatic if libNBC is detected on the
>> system. It would make sense to also add -libNBC to the mpicc et al
>> as well when the build system includes the function definitions.
>> This would allow those users that want (or can) to use that library
>> against it, without adding a bunch of source code to our release. I
>> there are complications that will have to be dealt with, but offer
>> it as
>> something to consider.
>> Also, remember that there is an added burden when we add source code
>> to Open
>> MPI that we haven't discussed - we are now adding coordination
>> issues to our
>> own release cycle. If libNBC changes, are we now going to be pressed to
>> issue another OMPI release so that the new NBC version is included?
>> Do we
>> now need to coordinate our releases with theirs so that things align?
>> And if we have an increasing number of such "included" packages, how
>> is -that- release discussion going to get?!?
>> On 2/7/08 4:48 AM, "Terry Dontje" <Terry.Dontje_at_[hidden]> wrote:
>>> Torsten Hoefler wrote:
>>>> Hi Brian,
>>>>> Let me start by reminding everyone that I have no vote, so this will
>>>>> probably be sent to /dev/null.
>>>> thanks for your comment and this will not go to /dev/null!
>>>>> I think Ralph raised some good points. I'd like to raise another.
>>>> yes [will reply to this in a separate thread]
>>>>> Does it make sense to bring LibNBC into the release at this point,
>>>>> given the current standardization process of non-blocking collectives?
>>>>> My feeling is no, based on the long term support costs. We had
>>>>> problem with a function in LAM/MPI -- MPIL_SPAWN, I believe it
>>>>> was --
>>>>> that was almost but not quite MPI_COMM_SPAWN. It was added to
>>>>> spawn before the standard was finished for dynamics. The problem was
>>>>> it wasn't quite MPI_COMM_SPAWN, so we were now stuck with yet another
>>>>> function to support (in a touchy piece of code) for infinity and
>>>>> I worry that we'll have the same with LibNBC -- a piece of code
>>>>> solves an immediate problem (no non-blocking collectives in MPI)
>>>>> will become a long-term support anchor. Since this is something
>>>>> be encouraging users to write code to, it's not like support for
>>>>> mvapi, where we can just deprecate it and users won't really
>>>>> It's one thing to tell them to update their cluster software
>>>>> stack --
>>>>> it's another to tell them to rewrite their applications.
>>>> I think this is a very good and valid point. However, I would like to
>>>> deprecate the NBC_* things as soon as non-blocking collectives are a
>>>> part of the standard. Of course, this would probably need two minor
>>>> versions to "clean" the code-base, but this is (will be) our normal
>>>> procedure (just what happened to MVAPI).
>>> Though it doesn't seem to me that NBC is a slam dunk to get into
>>> the MPI
>>> spec and I could
>>> imagine it changing significantly due to someone else's opinion/needs.
>>>> And rewriting the user's application will not be that hard, it'll
>>>> be vim:%s/NBC_/MPI_/g. Even if we change the interface (e.g. add
>>>> tags or
>>>> decide to use the more limited split collective approach), this
>>>> task is
>>>> rather easy and can be automated easily. It's not a functionality
>>>> change, just an interface.
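
The prefix rename mentioned above can be sketched outside an editor as well. This is a hedged Python equivalent of the vim substitution, with one deliberate difference: it only rewrites NBC_ at the start of an identifier, whereas %s/NBC_/MPI_/g replaces every occurrence. The helper name is hypothetical and the sample calls in the usage note are illustrative only. It covers the pure-rename case; if the interface itself changes (extra tags, split collectives), a plain substitution like this would not be enough.

```python
import re

# Match the NBC_ prefix only where an identifier begins, so that names such as
# MY_NBC_FLAG are left alone (unlike a plain global substitution).
NBC_PREFIX = re.compile(r"\bNBC_")


def rename_nbc_calls(source):
    """Return the source text with leading NBC_ prefixes replaced by MPI_."""
    return NBC_PREFIX.sub("MPI_", source)
```

For example, rename_nbc_calls("NBC_Wait(&req, MPI_STATUS_IGNORE);") gives "MPI_Wait(&req, MPI_STATUS_IGNORE);".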
>>> Though if NBC is built by default for release builds I think that
>>> the bar saying that we
>>> OMPI believe this should be used by all of our users without any
>>> concerns that the API may
>>> change or it might have significant issues.
>>> On a similar track do you have any tests that validate the
>>> functionality/correctness of NBC
>>> that can be run as a part of the MTT nightly tests?
>>> My opinion is I have no problem with NBC being merged in just that I
>>> don't think it should be
>>> built by default.
>>> devel mailing list