Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] RFC: move BTLs out of ompi into separate layer
From: Rainer Keller (keller_at_[hidden])
Date: 2009-03-09 17:07:13


Hi Jeff,
thanks for the mail!
I completely agree with your points.

To stress the point: the timeout date does not mean that we intend to simply
commit to trunk by that date.
Rather, it is the date by which we would like comments from all interested
parties. (This is what I remembered from previous RFCs, but I could be
wrong...)
All the work that has been committed so far is meant to clean up the code.
Anything that went beyond a cleanup deserved an RFC and input from many people
(such as the bitmap_t change...).

We still intend, as discussed at the Louisville meeting, to get as much input
as possible from the community (that's why this is a TRACS-visible svn
tmp-branch).

Thanks,
Rainer

On Monday 09 March 2009 04:52:28 pm Jeff Squyres wrote:
> Random points in no particular order (Rainer please correct me if I'm
> making bad assumptions):
>
> - I believe that ORNL is proposing to do this work on a separate
> branch (this is what we have discussed for some time now, and we
> discussed this deeply in Louisville). The RFC text doesn't
> specifically say, but I would be very surprised if this stuff is
> planned to come back to the trunk in the near future -- as we have all
> agreed, it's not done yet.
>
> - I believe that the timeout field in RFCs is a limit for
> non-responsiveness -- it is mainly intended to prevent people from
> ignoring / not responding to RFCs. I do not believe that Rainer was
> using that date as a "that's when I'm bringing it all back to the
> trunk." Indeed, he specifically called out the 1.5 series as a target
> for this work.
>
> - I also believe that Rainer is using this RFC as a means to get
> preliminary review of the work that has been done on the branch so
> far. He has provided a script that shows what they plan to do, how
> the code will be laid out, etc. There are still some important core
> issues to be solved -- and, like Brian, I want to see how they'll get
> solved before being happy (we have strong precedent for this
> requirement) -- but I think all that Rainer was saying in his RFC was
> "here's where we are so far; can people review and see if they hate it?"
>
> - It was made abundantly clear in the Louisville meeting that ORTE has
> no short-term plans for using the ONET layer (probably no long-term
> plans, either, but hey -- never say "never" :-) ). The design of ONET
> is such that other RTEs *could* use ONET if they want (e.g., STCI
> will), but it is not a requirement for the underlying RTE to use
> ONET. We agreed in Louisville that ORTE will provide sufficient stubs
> and hooks (all probably effectively no-ops) so that ONET can compile
> against it in the default OMPI configuration; other RTEs that want to
> do more meaningful stuff will need to provide more meaningful
> implementations of the stubs and hooks.
>
> - Hopefully the teleconference time tomorrow works out for Rich (his
> communications were unclear on this point). Otherwise, postponing the
> admin discussion until April seems problematic.
>
> On Mar 9, 2009, at 4:01 PM, Brian W. Barrett wrote:
> > I, not surprisingly, have serious concerns about this RFC. It assumes that
> > the ompi_proc issues and bootstrapping issues (the entire point of the
> > move, as I understand it) can both be solved, but offers no proof to
> > support that claim. Without those two issues solved, we would be left
> > with an onet layer that is dependent on ORTE and OMPI, and which OMPI
> > depends upon. This is not a good place to be. These issues should be
> > resolved before an onet layer is created in the trunk.
> >
> > This is not an unusual requirement. The fault tolerance work took a very
> > long time because of similar requirements. Not only was a full
> > implementation required to prove performance would not be negatively
> > impacted (when FT wasn't active), but we had discussions about its impact
> > on code maintainability. We had a full implementation of all the pieces
> > that impacted the code *before* any of it was allowed into the trunk.
> >
> > We should live by the rules the community has set up. They have served us
> > well in the past. Further, these are not new objections on my part.
> > Since the initial RFCs related to this move started, I have continually
> > brought up the exact same questions and never gotten a satisfactory
> > answer. This RFC even acknowledges the issues, but without presenting any
> > solution and still asks to do the most disruptive work. I simply can't
> > see how that fits with Open MPI's long-standing development
> > procedures.
> >
> > If all the issues I've asked about previously (which are essentially the
> > ones you've identified in the RFC) can be solved, the impact to code base
> > maintainability is reasonable, and the impact to performance is
> > negligible, I'll gladly remove my objection to this RFC.
> >
> > Further, before any work on this branch is brought into the trunk, the
> > admin-level discussion regarding this issue should be resolved. At this
> > time, that discussion is blocking on ORNL and they've given April as the
> > earliest such a discussion can occur. So at the very least, the RFC
> > timeout should be pushed into April or ORNL should revise their
> > availability for the admin discussion.
> >
> >
> > Brian
> >
> > On Mon, 9 Mar 2009, Rainer Keller wrote:
> > > What: Move BTLs into separate layer
> > >
> > > Why: Several projects have expressed interest in using the BTLs.
> > > Use-cases such as the RTE using the BTLs for modex, or tools
> > > collecting/distributing data in the fastest possible way, may become
> > > possible.
> > >
> > > Where: This would affect several components that the BTLs depend on
> > > (namely allocator, mpool, rcache and the common part of the BTLs).
> > > Additionally some changes to classes were/are necessary.
> > >
> > > When: Preferably 1.5 (in case we use the Feature/Stable Release cycle ;-)
> > >
> > > Timeout: 23.03.2009
> > >
> > > ------------------------------------------------------------------------
> > >
> > > There has been much speculation about this project.
> > > This RFC should shed some light; if more information is required,
> > > please feel free to ask/comment. Of course, suggestions are welcome!
> > >
> > > The BTLs offer access to a fast communication framework. Several projects
> > > have expressed interest in using them separately from other layers of
> > > Open MPI.
> > > Additionally (with further changes) BTLs may be used within ORTE itself.
> > >
> > > COURSE OF WORK:
> > > The extraction is not easy (as was the extraction of ORTE and OMPI in the
> > > early stages of Open MPI?).
> > > In order to get as much input and be as visible as possible (e.g. in TRACS),
> > > the tmp-branch for this work has been set up on:
> > > https://svn.open-mpi.org/svn/ompi/tmp/koenig-btl
> > >
> > > We propose to have a separate ONET library living in onet, based on orte
> > > (see attached fig).
> > >
> > > In order to keep the diff between the trunk and the branch to a minimum,
> > > several cleanup patches have already been applied to the trunk (e.g.
> > > unnecessary #include of ompi and orte header files, integration of
> > > ompi_bitmap_t into opal_bitmap_t, #include "*_config.h").
> > >
> > >
> > > Additionally a script (attached below) has been kept up-to-date
> > > (contrib/move-btl-into-onet), that will perform this separation on a fresh
> > > checkout of trunk:
> > > svn list https://svn.open-mpi.org/svn/ompi/tmp/koenig-btl/contrib/move-btl-into-onet
> > >
> > > This script requires several patches (see attached TAR-ball).
> > > Please update the variable PATCH_DIR to match the location of the patches.
> > >
> > > ./move-btl-into-onet ompi-clean/
> > > # Lots of output deleted.
> > > cd ompi-clean/
> > > rm -fr ompi/mca/common/ # No two mcas called common, too bad...
> > > ./autogen.sh
> > >
> > >
> > > OTHER RTEs:
> > > A preliminary header file is provided in onet/include/rte.h to accommodate
> > > the requirements of other RTEs (such as stci); it replaces selected
> > > functionality, as proposed by Jeff and Ralph in the Louisville meeting.
> > > Additionally, this header file is included before orte-header files
> > > (within onet)...
> > > By default, this does not change anything in the standard case (ORTE);
> > > otherwise, with -DHAVE_STCI, the orte functionality required by components
> > > within onet is redefined.
> > >
> > >
> > > TESTS:
> > > First tests have been done locally on Linux/x86_64.
> > > The branch compiles without warnings.
> > > The wrappers have been updated.
> > >
> > > The Intel Testsuite runs without failures:
> > > ./run-tests.pl all_tests_no_perf
> > >
> > >
> > > PERFORMANCE:
> > > !!!Before any merge, do extensive performance tests on real machines!!!
> > > Initial tests on the cluster smoky show no difference in comparison to
> > > ompi-trunk.
> > > Please see the enclosed output of NetPipe-3.7.1 run on a single node
> > > (--mca btl sm,self) on smoky.
> > >
> > >
> > > TODOS:
> > > There are still some todos to finalize this:
> > > - Dependencies in the onet-layer into the ompi-layer (ompi_proc_t,
> > > ompi_converter)
> > > We are working on these, and have briefly talked about the latter with
> > > George.
> > > - Better abstraction from orte / cleanups, such as modex
> > >
> > > If these involve code-changes (and not just "safe" and non-intrusive
> > > renames), such as a opal_keyval-change, we will continue to write RFCs.
> >
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
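
To illustrate the idea discussed in the quoted thread (a header that leaves the default case untouched and redirects selected RTE functionality at compile time when -DHAVE_STCI is given), here is a minimal C sketch. It is not the content of onet/include/rte.h, and it does not use any real ORTE or STCI call: every demo_* name, the DEMO_RTE_NAME macro and the "publish" hook are assumptions invented for this example. Only the HAVE_STCI switch comes from the RFC text.

```c
/* Hypothetical sketch only: every demo_* name is invented for illustration
 * and is not taken from onet/include/rte.h, ORTE or STCI. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef HAVE_STCI
/* Alternative RTE, selected with -DHAVE_STCI: a stand-in for a "more
 * meaningful" implementation that records the last published value. */
static char demo_last_value[64];

static int demo_stci_publish(const char *key, const char *value)
{
    (void)key;
    strncpy(demo_last_value, value, sizeof(demo_last_value) - 1);
    demo_last_value[sizeof(demo_last_value) - 1] = '\0';
    return 0;
}

#define DEMO_RTE_NAME "stci"
#define demo_rte_publish(key, value) demo_stci_publish((key), (value))
#else
/* Default case: a no-op stub that exists only so that the layer above
 * compiles and links. */
static int demo_default_publish(const char *key, const char *value)
{
    (void)key;
    (void)value;
    return 0;
}

#define DEMO_RTE_NAME "default"
#define demo_rte_publish(key, value) demo_default_publish((key), (value))
#endif

/* Code in the network layer calls the macro, never a specific RTE. */
int demo_layer_announce(const char *endpoint)
{
    assert(endpoint != NULL);
    return demo_rte_publish("endpoint", endpoint);
}
```

Built without any extra flag, demo_layer_announce() goes through the no-op stub; built with -DHAVE_STCI, the same call site is redirected to the alternative implementation without touching the layer code. Whether the real header uses macros, function tables or something else is not stated in the RFC, so this shows only one possible shape.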

-- 
------------------------------------------------------------------------
Rainer Keller, PhD                  Tel: +1 (865) 241-6293
Oak Ridge National Lab          Fax: +1 (865) 241-4811
PO Box 2008 MS 6164           Email: keller_at_[hidden]
Oak Ridge, TN 37831-2008    AIM/Skype: rusraink