Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Preparations for moving the btl's
From: Richard Graham (rlgraham_at_[hidden])
Date: 2008-12-04 12:37:36


Here is where I think we should reconsider accessing the notifier component
in the btl. It creates dependencies in the btl that are not needed. The
idea of a notifier component is a good one, but I would defer using it to
upper layers, rather than embedding it in the guts of the communication
system. I would be in favor of an approach that sends the information up
the call stack. The btl's should not depend on other communication
primitives, as they are the communication primitive.

Rich
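
One way to picture "sending the information up the call stack" is an error callback that the upper layer registers with the transport. The sketch below is illustrative only: every name in it (`xport_module_t`, `xport_register_error_cb`, `xport_send`, `upper_on_error`) is hypothetical and is not an Open MPI API. It assumes a toy transport in which a negative peer id stands in for an unreachable peer. The transport only reports the failure; the upper layer decides whether to hand it to something like a notifier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* All identifiers here are hypothetical; none are Open MPI symbols. */
typedef enum { XPORT_OK = 0, XPORT_ERR_UNREACHABLE = 1 } xport_status_t;

/* Error description handed upward by the transport. */
typedef struct {
    xport_status_t status;
    int peer;
    const char *detail;
} xport_error_t;

/* Callback type supplied by whichever layer sits above the transport. */
typedef void (*xport_error_cb_t)(const xport_error_t *err, void *ctx);

typedef struct {
    xport_error_cb_t on_error; /* NULL means nobody is listening */
    void *ctx;                 /* opaque pointer owned by the upper layer */
} xport_module_t;

static void xport_register_error_cb(xport_module_t *m, xport_error_cb_t cb, void *ctx)
{
    assert(m != NULL);
    m->on_error = cb;
    m->ctx = ctx;
}

/* Toy send: a negative peer id is treated as unreachable (an assumption
 * made only for this sketch). The transport reports the failure upward
 * and returns; it has no dependency on any notifier or messaging service. */
static xport_status_t xport_send(xport_module_t *m, int peer, const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    if (peer < 0) {
        xport_error_t err = { XPORT_ERR_UNREACHABLE, peer, "peer unreachable" };
        if (m->on_error != NULL) {
            m->on_error(&err, m->ctx);
        }
        return XPORT_ERR_UNREACHABLE;
    }
    return XPORT_OK;
}

/* Upper layer: records the event. This is where a call into a notifier
 * (or a scheduler interface) could live instead of inside the transport. */
typedef struct {
    int count;
    int last_peer;
} upper_record_t;

static void upper_on_error(const xport_error_t *err, void *ctx)
{
    upper_record_t *rec = ctx;
    rec->count++;
    rec->last_peer = err->peer;
    fprintf(stderr, "upper layer: peer %d: %s\n", err->peer, err->detail);
}
```

To exercise it, a caller would zero-initialize an `xport_module_t` and an `upper_record_t`, register `upper_on_error` with the record as context, and call `xport_send`. With this arrangement the transport needs only the callback type, so the dependency points from the upper layer down rather than from the transport into other services.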

On 12/4/08 9:04 AM, "Ralph Castain" <rhc_at_[hidden]> wrote:

> Yes, FTB utilizes the notifier framework. In addition, we have three
> other components getting ready to be added to that framework that will
> provide interfaces to Moab, SLURM, and a DOE monitoring program. The
> first two will require messaging capabilities to tell the schedulers
> about problem nodes/routes. The latter will also use a messaging
> protocol, but is mostly aimed at alerting operators to a problem and
> creating a historical archive.
>
> That said, we can expect the use of orte_notifier to spread across
> the BTL's pretty aggressively in the next few months, and for the
> notifier API to change/expand as we address these needs.
>
> On Dec 4, 2008, at 6:13 AM, Jeff Squyres wrote:
>
>> > I think you got it right. And I think we're pretty good in terms of
>> > BTL usage of ORTE and OPAL (to include the new "notifier" service
>> > that Ralph put in recently -- what the FTB will likely eventually
>> > use, I think...?); those interfaces and abstraction barriers are
>> > technologically enforced. If you break the abstractions, the linker
>> > will swiftly and unmercifully punish you. (this was exactly [one
>> > of] the rationale that we used for splitting the code base into
>> > OPAL, ORTE, and OMPI several years ago)
>> >
>> > Greg has already noted on the wiki a few constants used in the BTL's
>> > that have an OMPI_ prefix that aren't really OMPI values (e.g.,
>> > OMPI_ENABLE_HETEROGENEOUS_SUPPORT). These come from configure
>> > (i.e., opal/include/opal_config.h) and were not renamed back when we
>> > split the code base into OPAL, ORTE, and OMPI. I don't think we had
>> > a strong reason for not renaming them -- most could probably be
>> > renamed to OPAL_* -- we just didn't do it then. Perhaps they can be
>> > changed during the BTL extraction process (I noted this on the wiki).
>> >
>> >
>> >
>> > On Dec 3, 2008, at 9:43 PM, Richard Graham wrote:
>> >
>>> >> BTW,
>>> >> I was guessing FTB is Fault Tolerant Backbone, but if not, can
>>> >> someone tell me what it is? If it is not the latter, what I just
>>> >> wrote about it makes no sense.
>>> >>
>>> >> Rich
>>> >>
>>> >>
>>> >> On 12/3/08 9:34 PM, "Richard Graham" <rlgraham_at_[hidden]> wrote:
>>> >>
>>>> >>> The goal is to use the btl's outside of the context of MPI, which
>>>> >>> was what was in mind from the day the ompi work started over five
>>>> >>> years ago, but with no other use at the time, things grew up
>>>> >>> intermingled - no surprise at all. What we are attempting to do
>>>> >>> is to untangle the existing dependencies, and make a much cleaner
>>>> >>> distinction between how/what data is passed between layers.
>>>> >>>
>>>> >>> I expect this will involve some sort of well defined interface
>>>> >>> between the btl's and orte, and I don't know if this will also
>>>> >>> require something like this between the btl's and the pml - I
>>>> >>> think that interface is rigidly enforced, but am not sure.
>>>> >>>
>>>> >>> I expect that explicit calls to FTB in the btl layer would have to
>>>> >>> be componentized, especially in the context of what is developing
>>>> >>> in the FT working group of the MPI Forum. Not that FTB is bad in
>>>> >>> any way, just that it is one of many monitors.
>>>> >>>
>>>> >>> We will need to talk about this on a case by case basis, and
>>>> >>> decide how to proceed. If anyone wants to help, please do.
>>>> >>>
>>>> >>> Rich
>>>> >>>
>>>> >>>
>>>> >>> On 12/3/08 3:02 PM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>>>> >>>
>>>>> >>>> I managed to execute the modex-less changes pretty much without
>>>>> >>>> introducing additional ORTE dependencies into the BTL's, though there
>>>>> >>>> may be some additions as we look at the other BTLs that I didn't
>>>>> >>>> address. So hopefully that won't contribute too much to the issue
>>>>> >>>> here.
>>>>> >>>>
>>>>> >>>> At the moment, I don't think it matters where notifier sits - it might
>>>>> >>>> be able to move to OPAL. Only catch will be if some notifier component
>>>>> >>>> requires communications. I'm thinking of FTB, for example, and our own
>>>>> >>>> local monitoring program that may require TCP messaging. We don't
>>>>> >>>> currently have anything in OPAL that would support an OPAL level
>>>>> >>>> messaging system, though perhaps that could be resolved.
>>>>> >>>>
>>>>> >>>> We also have dependencies where the BTL's will call orte_ess to find
>>>>> >>>> out what node another proc is on, the node local rank of that proc,
>>>>> >>>> etc. Those dependencies are likely to grow after the Dec meeting (see
>>>>> >>>> wiki for that agenda item), and definitely cannot be moved to OPAL.
>>>>> >>>>
>>>>> >>>> However, note that Rich stated the BTL's were -not- moving to OPAL.
>>>>> >>>> This begs the question: where -are- they going? Into their own layer?
>>>>> >>>> Will that layer be somewhere in-between OMPI and ORTE (in which case,
>>>>> >>>> the ORTE dependencies are moot)?
>>>>> >>>>
>>>>> >>>> I note that the wiki page doesn't address any of these questions,
>>>>> >>>> which is understandable if things are just getting underway. But it
>>>>> >>>> does sound like this is going to take some thought to ensure we don't
>>>>> >>>> paint ourselves into a corner.
>>>>> >>>>
>>>>> >>>> Ralph
>>>>> >>>>
>>>>> >>>>
>>>>> >>>> On Dec 3, 2008, at 12:10 PM, Jeff Squyres wrote:
>>>>> >>>>
>>>>>> >>>> > FWIW, I see lots of notifier calls being added to the BTLs (and
>>>>>> >>>> > elsewhere throughout the OMPI code base) over time...
>>>>>> >>>> >
>>>>>> >>>> > On Dec 3, 2008, at 2:07 PM, Tim Mattox wrote:
>>>>>> >>>> >
>>>>>>> >>>> >> The BTLs might have added calls to the notifier framework in their
>>>>>>> >>>> >> error paths.
>>>>>>> >>>> >> The notifier framework is currently in the ORTE layer... not sure
>>>>>>> >>>> >> if we could move it down to OPAL. Ralph, any thoughts on that?
>>>>>>> >>>> >>
>>>>>>> >>>> >> On Wed, Dec 3, 2008 at 11:56 AM, Richard Graham <rlgraham_at_[hidden]>
>>>>>>> >>>> >> wrote:
>>>>>>>> >>>> >>> George told me about what he is doing, so no changes would be
>>>>>>>> >>>> >>> committed until George has his changes in.
>>>>>>>> >>>> >>>
>>>>>>>> >>>> >>> Are there other changes to the btl's that we should be aware of?
>>>>>>>> >>>> >>>
>>>>>>>> >>>> >>> Rich
>>>>>>>> >>>> >>>
>>>>>>>> >>>> >>>
>>>>>>>> >>>> >>> On 12/3/08 11:47 AM, "George Bosilca" <bosilca_at_[hidden]> wrote:
>>>>>>>> >>>> >>>
>>>>>>>>> >>>> >>>> Terry,
>>>>>>>>> >>>> >>>>
>>>>>>>>> >>>> >>>> I'm involved [to some degree] in both efforts and I can confirm
>>>>>>>>> >>>> >>>> these two efforts will not affect each other in any bad way.
>>>>>>>>> >>>> >>>>
>>>>>>>>> >>>> >>>> george.
>>>>>>>>> >>>> >>>>
>>>>>>>>> >>>> >>>> On Dec 3, 2008, at 11:42 , Terry Dontje wrote:
>>>>>>>>> >>>> >>>>
>>>>>>>>>> >>>> >>>>> I don't have any *strong* objections. However, I know that Eugene
>>>>>>>>>> >>>> >>>>> and George B have been working on some Fastpath code changes, so
>>>>>>>>>> >>>> >>>>> we should make sure neither project obliterates the other.
>>>>>>>>>> >>>> >>>>>
>>>>>>>>>> >>>> >>>>> --td
>>>>>>>>>> >>>> >>>>>
>>>>>>>>>> >>>> >>>>> Richard Graham wrote:
>>>>>>>>>>> >>>> >>>>>> Now that 1.3 will be released, we would like to go ahead with the
>>>>>>>>>>> >>>> >>>>>> plan to move the btl's out of the MPI layer. Greg Koenig who is
>>>>>>>>>>> >>>> >>>>>> doing most of the work has started a wiki page with details on
>>>>>>>>>>> >>>> >>>>>> the plans. Right now details are sketchy, as Greg is digging through
>>>>>>>>>>> >>>> >>>>>> the code, and has only hand written notes on data structures that
>>>>>>>>>>> >>>> >>>>>> need to be moved, include files that are not needed, etc. The
>>>>>>>>>>> >>>> >>>>>> page is at:
>>>>>>>>>>> >>>> >>>>>> https://svn.open-mpi.org/trac/ompi/wiki/BTLExtraction
>>>>>>>>>>> >>>> >>>>>>
>>>>>>>>>>> >>>> >>>>>> The first three steps basically only involve code motion, moving
>>>>>>>>>>> >>>> >>>>>> items such as ompi_list, and renaming them, moving where the code
>>>>>>>>>>> >>>> >>>>>> is actually located in the repository, and the like. For these we
>>>>>>>>>>> >>>> >>>>>> do not plan to put out a formal RFC, but comments are very
>>>>>>>>>>> >>>> >>>>>> welcome, and any hands that are willing to help with this are even
>>>>>>>>>>> >>>> >>>>>> more welcome.
>>>>>>>>>>> >>>> >>>>>>
>>>>>>>>>>> >>>> >>>>>> The last phase where the btl's are made dependent on OPAL, and
>>>>>>>>>>> >>>> >>>>>> supporting libraries such as mpools I expect will be disruptive,
>>>>>>>>>>> >>>> >>>>>> and will definitely require an RFC, and will also be a longer
>>>>>>>>>>> >>>> >>>>>> process.
>>>>>>>>>>> >>>> >>>>>>
>>>>>>>>>>> >>>> >>>>>> Please send comments,
>>>>>>>>>>> >>>> >>>>>> Rich
>>>>>>>>>>> >>>> >>>>>>
>>>>> >>>>
>>>>> ------------------------------------------------------------------------
>>>>>>>>>>> >>>> >>>>>>
>>>>>>>>>>> >>>> >>>>>> _______________________________________________
>>>>>>>>>>> >>>> >>>>>> devel mailing list
>>>>>>>>>>> >>>> >>>>>> devel_at_[hidden]
>>>>>>>>>>> >>>> >>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>>>>> >>>> >>>>>>
>>>>>>> >>>> >> --
>>>>>>> >>>> >> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>>>>>>> >>>> >> tmattox_at_[hidden] || timattox_at_[hidden]
>>>>>>> >>>> >> I'm a bright... http://www.the-brights.net/
>>>>>>> >>>> >>
>>>>>> >>>> > --
>>>>>> >>>> > Jeff Squyres
>>>>>> >>>> > Cisco Systems
>>>>>> >>>> >
>> > --
>> > Jeff Squyres
>> > Cisco Systems