Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Question about priority
From: George Bosilca (bosilca_at_[hidden])
Date: 2008-05-23 13:06:35


On May 23, 2008, at 9:56 AM, Josh Hursey wrote:

> Unfortunately, as Jeff pointed out, the behavior of frameworks and
> components in determining component selection is not consistent in the
> codebase. The mca_base_select() commit made things much better, but
> there are still frameworks that do not (or cannot) use it, and there
> are some behaviors that are just not well defined.
>
> Consistency issues lead to user (and developer) confusion and degrade
> the image of the project. For exactly those reasons I want to talk
> about a number of such issues in one of our technical meetings this
> summer (this issue is currently scheduled for the July meeting). The
> goal is to come out of that meeting with a coding standard behavior
> for components during open/selection/close. Frameworks and components
> can diverge from this base standard, but then it is the responsibility
> of the component writer to make sure this is clearly communicated to
> users about expectations.

This is a pretty strong statement, and some examples would be welcome.
Anyway, we already have a coding standard for component manipulation,
and apparently there are cases where we need hand-crafted selection
logic (such as the collectives, as Josh pointed out). The ^component
syntax is handled at the bottom layer, where we create the list of
components to be opened, so this is consistent across the board.
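
To illustrate what "handled at the bottom layer" means, here is a rough
sketch in C. It is not the actual MCA base code: should_open() and
list_contains() are made-up names, and the spec string stands for
whatever the user passed for the framework (e.g. "^hierarch"). The only
point is that an excluded component is filtered out before it is ever
opened, so no framework-specific selection code sees it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an Open MPI function): returns true if the
 * comma-separated list contains exactly the given name. */
bool list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;

    while (*p != '\0') {
        size_t tok_len = strcspn(p, ",");   /* length of current token */
        if (tok_len == name_len && strncmp(p, name, tok_len) == 0) {
            return true;
        }
        p += tok_len;
        if (*p == ',') {
            p++;                            /* skip the separator */
        }
    }
    return false;
}

/* Hypothetical filter applied while building the list of components to
 * open (assumed semantics):
 *   spec == NULL or ""  : open everything
 *   spec == "^a,b"      : open everything except a and b
 *   spec == "a,b"       : open only a and b */
bool should_open(const char *spec, const char *name)
{
    assert(name != NULL);

    if (spec == NULL || spec[0] == '\0') {
        return true;
    }
    if (spec[0] == '^') {
        return !list_contains(spec + 1, name);
    }
    return list_contains(spec, name);
}
```

With this sketch, should_open("^hierarch", "hierarch") is false while
should_open("^hierarch", "tuned") is true, whatever the framework.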

> To answer your question though, an individual component can determine
> what to return for the {priority,module} pair based on anything it
> wishes. For instance the SLURM PLM component will return NULL if it
> does not see the correct environment variables, and a working module
> if it does. Collectives are a special type of framework so the
> selection logic there is specialized, meaning it does not use the
> mca_base_select function, but uses a more custom version of select.
>
> If you supply "^component" then the component is never opened and thus
> never queried during selection. If you specify 0 for the priority of
> the hierarch component then the component is opened, and will just
> return NULL during selection. If you specify > 0 for the priority then
> the hierarch component will return a module to the selection code.
> This module will be used if the hierarch component has the 'best'
> priority, otherwise the hierarch component should be closed
> [hierarch_component_close] at the end of the selection code.
> Determining the 'best' priority and whether or not the components are
> closed at the end of selection is determined by the coll/base select
> function.
>
> I think I may have just made things seem more complex than they
> probably are.

I don't think so. For me the process is straightforward. Here are the
possible scenarios:
1) ^component behaves as if the corresponding file (i.e. shared
library) is not available.
2) init returning a NULL module means that this component does not
want to be selected. There is no need to clarify the reason why; the
outcome is that the component chose to be ignored.
3) returning a non-NULL module and a priority allows the selection
logic to include the specified module in the selection process. Of
course the selection process is different for some frameworks, but this
is to be expected. Keep in mind that while there are one-to-one
frameworks (such as the IO subsystem and the PML) and many-to-one
frameworks (such as the BTLs and the collectives), the priority always
allows the selector to order the modules by decreasing priority. Then,
based on the type of the framework (one-to-one or many-to-one), the
selector picks the first or all modules from the list and closes the
others. As I said ... straightforward :)
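
Scenarios 2 and 3 can be sketched in C as follows. This is a minimal
illustration under stated assumptions, not the real mca_base_select or
coll/base code: candidate_t, select_modules() and the closed flag are
hypothetical names invented for the example. A NULL module removes the
candidate from consideration, the remaining ones are ordered by
decreasing priority, and the framework type decides whether one or all
of them are kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for what a component hands back when queried. */
typedef struct {
    const char *name;
    int         priority;  /* only used for ranking when module != NULL */
    void       *module;    /* NULL means "do not select me" */
    bool        closed;    /* set by the selector for unused components */
} candidate_t;

/* qsort comparator: willing candidates first, then decreasing priority. */
int by_decreasing_priority(const void *a, const void *b)
{
    const candidate_t *x = a;
    const candidate_t *y = b;

    if ((x->module == NULL) != (y->module == NULL)) {
        return (x->module == NULL) ? 1 : -1;
    }
    /* Compare without subtracting, so extreme values cannot overflow. */
    return (x->priority < y->priority) - (x->priority > y->priority);
}

/* Sorts cands in place and returns how many were selected. The selected
 * candidates end up at the front; all the others are marked closed. */
size_t select_modules(candidate_t *cands, size_t n, bool one_to_one)
{
    size_t willing = 0;
    size_t selected;

    assert(cands != NULL || n == 0);
    if (n == 0) {
        return 0;
    }

    qsort(cands, n, sizeof(cands[0]), by_decreasing_priority);

    /* Count the candidates that returned a non-NULL module. */
    while (willing < n && cands[willing].module != NULL) {
        willing++;
    }

    /* One-to-one frameworks keep the best; many-to-one keep them all. */
    selected = (one_to_one && willing > 0) ? 1 : willing;

    for (size_t i = 0; i < n; i++) {
        cands[i].closed = (i >= selected);
    }
    return selected;
}
```

Note that a negative priority does not exclude a candidate in this
sketch; only a NULL module does, which matches the rule Josh described
further down in the thread.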

   george.

>
>
> -- Josh
>
> On May 23, 2008, at 8:28 AM, Jeff Squyres wrote:
>
>> I think that technically, the component can do whatever it wants
>> (e.g., look at its priority, see 0, and decide to return NULL).
>> However, to be consistent, we should decide on a specific behavior
>> and make it uniform to all components.
>>
>> I'd opt for the ^foo notation to disable a component.
>>
>>
>> On May 23, 2008, at 8:14 AM, Rolf Vandevaart wrote:
>>
>>>
>>> This mostly makes sense. But let me probe a little more. Can a
>>> component return NULL if it looks at its priority and the priority
>>> is less than or equal to 0? For example, currently the hierarch
>>> component returns NULL when its priority is equal or less than 0.
>>> This means that as a user when I set the priority to 0 I am
>>> indicating that I do not want the hierarch component selected at
>>> all.
>>>
>>> Or, is the priority only used to specify relative behavior. So, it
>>> is not to be used to completely deselect a component. To deselect,
>>> you would need to use the ^component format.
>>>
>>> That is where I am confused.
>>>
>>> Rolf
>>>
>>>
>>> Josh Hursey wrote:
>>>> Yeah (Sorry I didn't reply earlier).
>>>>
>>>> Each component is asked for at least two items of information:
>>>> priority (int), and module (struct *).
>>>>
>>>> The priority can range from [INT_MIN, INT_MAX] with the highest
>>>> priority selected, even if that priority is negative.
>>>>
>>>> If the component does not want to be selected then it should return
>>>> NULL for the module value. This indicates to the selection logic
>>>> that no matter what the priority is set to the component should not
>>>> be a candidate for selection.
>>>>
>>>> So a component is selectable if it returns a non-NULL value for the
>>>> module struct, and is not selectable if it returns NULL. The
>>>> priority only indicates relative rank between all available
>>>> components.
>>>>
>>>> Does that make sense? I should probably add this comment to the
>>>> mca_base_select function to preserve it. I'll make a bug for it so
>>>> it doesn't get lost.
>>>>
>>>> -- Josh
>>>>
>>>> On May 23, 2008, at 7:14 AM, Jeff Squyres wrote:
>>>>
>>>>> We may not have this uniform throughout the code base -- this is
>>>>> one of the things we wanted to talk about in the Bay area meeting.
>>>>> I believe that the allowable range for priorities should be [0,
>>>>> 100], and that if you don't want to be selected, you should return
>>>>> NULL (or use some other mechanism to indicate that you didn't want
>>>>> to be selected). That was the original intent of the MCA selection
>>>>> mechanisms, at least.
>>>>>
>>>>> Josh -- is this consistent with what you found when you
>>>>> consolidated a lot of this stuff?
>>>>>
>>>>> On May 22, 2008, at 11:30 AM, Rolf vandeVaart wrote:
>>>>>
>>>>>> I know there was some recent discussion about priority of
>>>>>> components, but I wanted to double check. I am trying to
>>>>>> understand what priority = 0 means.
>>>>>>
>>>>>> My assumption is the following:
>>>>>> priority >= 0 means the component is selectable
>>>>>> priority < 0 means the component is not selectable
>>>>>>
>>>>>> I ask this because in some of the collective code it looks like a
>>>>>> priority = 0 means not selectable. Not a big deal, but I am
>>>>>> trying to fix a memory leak and I need to get this piece right.
>>>>>> And I assume that priority < 0 will give one the same behavior as
>>>>>> ^component but the code paths within Open MPI would be different.
>>>>>>
>>>>>> Rolf
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> devel_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>
>>>>> --
>>>>> Jeff Squyres
>>>>> Cisco Systems
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> =========================
>>> rolf.vandevaart_at_[hidden]
>>> 781-442-3043
>>> =========================
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>


