Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Preparations for moving the btl's
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-12-04 09:04:39


Yes, FTB utilizes the notifier framework. In addition, we have three
other components getting ready to be added to that framework that will
provide interfaces to Moab, SLURM, and a DOE monitoring program. The
first two will require messaging capabilities to tell the schedulers
about problem nodes/routes. The latter will also use a messaging
protocol, but is mostly aimed at alerting operators to a problem and
creating a historical archive.

  That said, we can expect the use of orte_notifier to spread across
the BTL's pretty aggressively in the next few months, and for the
notifier API to change/expand as we address these needs.
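
A minimal sketch of the fan-out described above, where one notifier call from an error path reaches several back ends (for example a scheduler interface and an operator archive). Every name here (`sketch_notifier_log`, `sketch_component_t`, the two sample components) is hypothetical and chosen for illustration; this is not the actual orte_notifier API, which is expected to change.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical severity levels; not the real orte_notifier constants. */
typedef enum { SKETCH_WARN = 0, SKETCH_ERROR = 1 } sketch_severity_t;

/* One notifier component: a name, a severity filter, and a callback. */
typedef struct {
    const char *name;
    int min_severity;   /* ignore events below this severity */
    int delivered;      /* number of events this component accepted */
    void (*log)(const char *component, sketch_severity_t sev, const char *msg);
} sketch_component_t;

#define SKETCH_MAX_COMPONENTS 8
static sketch_component_t *registry[SKETCH_MAX_COMPONENTS];
static int registry_len = 0;

/* Stand-in back end: a real component might message a scheduler instead. */
static void print_log(const char *component, sketch_severity_t sev, const char *msg)
{
    printf("[%s] severity=%d: %s\n", component, (int)sev, msg);
}

/* Register a component; returns 0 on success, -1 if the table is full. */
int sketch_notifier_register(sketch_component_t *c)
{
    assert(c != NULL && c->log != NULL);
    if (registry_len >= SKETCH_MAX_COMPONENTS) {
        return -1;
    }
    registry[registry_len++] = c;
    return 0;
}

/* Fan one event out to every interested component; returns how many took it. */
int sketch_notifier_log(sketch_severity_t sev, const char *msg)
{
    int taken = 0;
    for (int i = 0; i < registry_len; i++) {
        sketch_component_t *c = registry[i];
        if ((int)sev >= c->min_severity) {
            c->log(c->name, sev, msg);
            c->delivered++;
            taken++;
        }
    }
    return taken;
}

/* Illustrative components standing in for scheduler and archive back ends. */
sketch_component_t sketch_scheduler = { "scheduler", SKETCH_ERROR, 0, print_log };
sketch_component_t sketch_archive   = { "archive",   SKETCH_WARN,  0, print_log };
```

In this sketch a BTL error path would make a single `sketch_notifier_log()` call and stay unaware of which components are loaded; the scheduler stand-in only sees errors, while the archive stand-in records everything.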

On Dec 4, 2008, at 6:13 AM, Jeff Squyres wrote:

> I think you got it right. And I think we're pretty good in terms of
> BTL usage of ORTE and OPAL (to include the new "notifier" service
> that Ralph put in recently -- what the FTB will likely eventually
> use, I think...?); those interfaces and abstraction barriers are
> technologically enforced. If you break the abstractions, the linker
> will swiftly and unmercifully punish you. (this was exactly [one
> of] the rationale that we used for splitting the code base into
> OPAL, ORTE, and OMPI several years ago)
>
> Greg has already noted on the wiki a few constants used in the BTL's
> that have an OMPI_ prefix that aren't really OMPI values (e.g.,
> OMPI_ENABLE_HETEROGENEOUS_SUPPORT). These come from configure
> (i.e., opal/include/opal_config.h) and were not renamed back when we
> split the code base into OPAL, ORTE, and OMPI. I don't think we had
> a strong reason for not renaming them -- most could probably be
> renamed to OPAL_* -- we just didn't do it then. Perhaps they can be
> changed during the BTL extraction process (I noted this on the wiki).
>
>
>
> On Dec 3, 2008, at 9:43 PM, Richard Graham wrote:
>
>> BTW,
>> I was guessing FTB is Fault Tolerant Backbone, but if not, can
>> someone tell me what it is? If it is not the latter, what I just
>> wrote about it makes no sense.
>>
>> Rich
>>
>>
>> On 12/3/08 9:34 PM, "Richard Graham" <rlgraham_at_[hidden]> wrote:
>>
>>> The goal is to use the btl’s outside of the context of MPI, which
>>> was what was in mind from the day the ompi work started over five
>>> years ago, but with no other use at the time, things grew up
>>> intermingled – no surprise at all. What we are attempting to do
>>> is to untangle the existing dependencies, and make a much cleaner
>>> distinction between how/what data is passed between layers.
>>>
>>> I expect this will involve some sort of well defined interface
>>> between the btl’s and orte, and I don’t know if this will also
>>> require something like this between the btl’s and the pml – I
>>> think that interface is rigidly enforced, but am not sure.
>>>
>>> I expect that explicit calls to FTB in the btl layer would have to
>>> be componentized, especially in the context of what is developing
>>> in the FT working group of the MPI Forum. Not that FTB is bad in
>>> any way, just that it is one of many monitors.
>>>
>>> We will need to talk about this on a case by case basis, and
>>> decide how to proceed. If anyone wants to help, please do.
>>>
>>> Rich
>>>
>>>
>>> On 12/3/08 3:02 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>>>
>>>> I managed to execute the modex-less changes pretty much without
>>>> introducing additional ORTE dependencies into the BTL's, though there
>>>> may be some additions as we look at the other BTLs that I didn't
>>>> address. So hopefully that won't contribute too much to the issue
>>>> here.
>>>>
>>>> At the moment, I don't think it matters where notifier sits - it might
>>>> be able to move to OPAL. Only catch will be if some notifier component
>>>> requires communications. I'm thinking of FTB, for example, and our own
>>>> local monitoring program that may require TCP messaging. We don't
>>>> currently have anything in OPAL that would support an OPAL level
>>>> messaging system, though perhaps that could be resolved.
>>>>
>>>> We also have dependencies where the BTL's will call orte_ess to find
>>>> out what node another proc is on, the node local rank of that proc,
>>>> etc. Those dependencies are likely to grow after the Dec meeting (see
>>>> wiki for that agenda item), and definitely cannot be moved to OPAL.
>>>>
>>>> However, note that Rich stated the BTL's were -not- moving to OPAL.
>>>> This begs the question: where -are- they going? Into their own layer?
>>>> Will that layer be somewhere in-between OMPI and ORTE (in which case,
>>>> the ORTE dependencies are moot)?
>>>>
>>>> I note that the wiki page doesn't address any of these questions,
>>>> which is understandable if things are just getting underway. But it
>>>> does sound like this is going to take some thought to ensure we don't
>>>> paint ourselves into a corner.
>>>>
>>>> Ralph
>>>>
>>>>
>>>> On Dec 3, 2008, at 12:10 PM, Jeff Squyres wrote:
>>>>
>>>> > FWIW, I see lots of notifier calls being added to the BTLs (and
>>>> > elsewhere throughout the OMPI code base) over time...
>>>> >
>>>> > On Dec 3, 2008, at 2:07 PM, Tim Mattox wrote:
>>>> >
>>>> >> The BTLs might have added calls to the notifier framework in their
>>>> >> error paths.
>>>> >> The notifier framework is currently in the ORTE layer... not sure
>>>> >> if we could move it down to OPAL. Ralph, any thoughts on that?
>>>> >>
>>>> >> On Wed, Dec 3, 2008 at 11:56 AM, Richard Graham <rlgraham_at_[hidden]> wrote:
>>>> >>> George told me about what he is doing, so no changes would be
>>>> >>> committed until George has his changes in.
>>>> >>>
>>>> >>> Are there other changes to the btl's that we should be aware of?
>>>> >>>
>>>> >>> Rich
>>>> >>>
>>>> >>>
>>>> >>> On 12/3/08 11:47 AM, "George Bosilca" <bosilca_at_[hidden]> wrote:
>>>> >>>
>>>> >>>> Terry,
>>>> >>>>
>>>> >>>> I'm involved [to some degree] in both efforts and I can confirm
>>>> >>>> these two efforts will not affect each other in any bad way.
>>>> >>>>
>>>> >>>> george.
>>>> >>>>
>>>> >>>> On Dec 3, 2008, at 11:42 , Terry Dontje wrote:
>>>> >>>>
>>>> >>>>> I don't have any *strong* objections. However, I know that Eugene
>>>> >>>>> and George B have been working on some Fastpath code changes, and
>>>> >>>>> we should make sure neither project obliterates the other.
>>>> >>>>>
>>>> >>>>> --td
>>>> >>>>>
>>>> >>>>> Richard Graham wrote:
>>>> >>>>>> Now that 1.3 will be released, we would like to go ahead with the
>>>> >>>>>> plan to move the btl’s out of the MPI layer. Greg Koenig, who is
>>>> >>>>>> doing most of the work, has started a wiki page with details on
>>>> >>>>>> the plans. Right now details are sketchy, as Greg is digging through
>>>> >>>>>> the code, and has only handwritten notes on data structures that
>>>> >>>>>> need to be moved, include files that are not needed, etc. The
>>>> >>>>>> page is at:
>>>> >>>>>> https://svn.open-mpi.org/trac/ompi/wiki/BTLExtraction
>>>> >>>>>>
>>>> >>>>>> The first three steps basically only involve code motion: moving
>>>> >>>>>> items such as ompi_list and renaming them, moving where the code
>>>> >>>>>> is actually located in the repository, and the like. For these we
>>>> >>>>>> do not plan to put out a formal RFC, but comments are very welcome,
>>>> >>>>>> and any hands that are willing to help with this are even more
>>>> >>>>>> welcome.
>>>> >>>>>>
>>>> >>>>>> The last phase, where the btl’s are made dependent on OPAL and
>>>> >>>>>> supporting libraries such as mpools, I expect will be disruptive,
>>>> >>>>>> and will definitely require an RFC, and will also be a longer
>>>> >>>>>> process.
>>>> >>>>>>
>>>> >>>>>> Please send comments,
>>>> >>>>>> Rich
>>>> >>>>>>
>>>> ------------------------------------------------------------------------
>>>> >>>>>>
>>>> >>>>>> _______________________________________________
>>>> >>>>>> devel mailing list
>>>> >>>>>> devel_at_[hidden]
>>>> >>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> >>>>>>
>>>> >>>>>
>>>> >>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>>>> >> tmattox_at_[hidden] || timattox_at_[hidden]
>>>> >> I'm a bright... http://www.the-brights.net/
>>>> >>
>>>> >
>>>> >
>>>> > --
>>>> > Jeff Squyres
>>>> > Cisco Systems
>>>> >
>>>> >
>>>>
>>>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>