Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [RFC] mca_base_select()
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-05-09 11:14:27


Sure - take a look at the hg repository Jeff and I are working on:

http://www.open-mpi.org/hg/hgwebdir.cgi/rhc/channel

The opal/mca/filter framework illustrates the problem. I have one component
in there right now, with a default module defined in the base. That
component must only be selected if the user explicitly requests it. With the
current select logic, I can't do this: if the priority is >= 0, the component
is always automatically selected; if the priority is < 0, it is never
selectable, even if specified.
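
To make the requested behavior concrete, here is a minimal sketch in C. It
is not the actual mca_base_select() code: sketch_component_t and
sketch_select() are hypothetical names, and the priority field simply stands
in for whatever a component's query function would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a component; not the real mca_base_component_t. */
typedef struct {
    const char *name;
    int priority;   /* value the component's query would report */
} sketch_component_t;

/*
 * Return the index of the component to use, or -1 if none is selected.
 * "requested" is the name given by the user (as in "-mca foo bar"),
 * or NULL when the user did not ask for a specific component.
 */
static int sketch_select(const sketch_component_t *comps, size_t n,
                         const char *requested)
{
    int best = -1;

    assert(comps != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        if (requested != NULL) {
            /* Explicit request: ignore the priority and just use it. */
            if (strcmp(comps[i].name, requested) == 0) {
                return (int)i;
            }
            continue;
        }
        /* No request: a negative priority means "never auto-select". */
        if (comps[i].priority < 0) {
            continue;
        }
        if (best < 0 || comps[i].priority > comps[best].priority) {
            best = (int)i;
        }
    }
    return best;
}
```

With a single component "bar" at priority -1, sketch_select() returns -1
when nothing is requested and 0 when "bar" is requested by name.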

Thanks
Ralph

On 5/9/08 8:52 AM, "Josh Hursey" <jjhursey_at_[hidden]> wrote:

> Ralph,
>
> Can you give me an example of a component that I can look at? It will
> allow me to test the fix before committing, and to better understand
> the problem.
>
> -- Josh
>
> On May 9, 2008, at 10:41 AM, Ralph Castain wrote:
>
>> I just hit a problem with this logic - should be a minor change.
>>
>> We have several frameworks where we have components that are only
>> allowed to be selected if the user specifically requests them by
>> stating -mca foo bar. Because it is possible for there to be no other
>> components that want to be selected, and because it is permissible for
>> no components to be selected for that framework, we set bar's priority
>> to be -1.
>>
>> The new select logic will not allow a component with a negative
>> priority to be selected, even if the user specifically requested that
>> component.
>>
>> If we set the priority to be 0, then the system will allow the component
>> to be automatically selected. This is not allowed as it can lead to bad
>> behavior.
>>
>> So what we need the select system to do is say "if someone specified a
>> specific component, don't worry about the returned priority - just use it"
>>
>> Josh: could you please modify this?
>>
>> Thanks!
>> Ralph
>>
>>
>>
>> On 5/8/08 7:04 PM, "Pak Lui" <Pak.Lui_at_[hidden]> wrote:
>>
>>> Thanks very much Josh! Will try it out soon.
>>>
>>> Josh Hursey wrote:
>>>> Sorry about that. I didn't test that type of option. It should be
>>>> working in r18418. Let me know if you see any more issues.
>>>>
>>>> -- Josh
>>>>
>>>> On May 8, 2008, at 6:04 PM, Pak Lui wrote:
>>>>
>>>>> I think I have a problem but I am not sure. I used to be able to use
>>>>> the circumflex (^) to switch between the gridengine launcher and the
>>>>> ssh launchers by doing something like this, e.g. -mca plm ^gridengine,
>>>>> to exclude some of the plm components (and also in ras). It doesn't
>>>>> seem like the 'negate' is in mca_base_component anymore. I guess I
>>>>> just have to spell out which component I want explicitly...
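
For readers unfamiliar with the circumflex syntax, the following is a rough
sketch in C of how an include/exclude list such as "^gridengine" can be
interpreted. The helper names list_contains() and filter_allows() are
hypothetical and the component name "rsh" is only illustrative; this is not
the Open MPI implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if "name" appears as a whole item in the comma-separated "list". */
static bool list_contains(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t item_len = (end != NULL) ? (size_t)(end - p) : strlen(p);

        if (item_len == len && strncmp(p, name, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return false;
}

/*
 * Decide whether component "name" passes the user's filter "spec".
 *   NULL or ""  : no filter, everything is allowed
 *   "a,b"       : only a and b are allowed
 *   "^a,b"      : everything except a and b is allowed
 */
static bool filter_allows(const char *spec, const char *name)
{
    assert(name != NULL);

    if (spec == NULL || spec[0] == '\0') {
        return true;
    }
    if (spec[0] == '^') {
        return !list_contains(spec + 1, name);
    }
    return list_contains(spec, name);
}
```

Under these assumptions, filter_allows("^gridengine", "gridengine") is false
while filter_allows("^gridengine", "rsh") is true.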
>>>>>
>>>>> Josh Hursey wrote:
>>>>>> This has been committed in r18381
>>>>>>
>>>>>> Please let me know if you have any problems with this commit.
>>>>>>
>>>>>> Cheers,
>>>>>> Josh
>>>>>>
>>>>>> On May 5, 2008, at 10:41 AM, Josh Hursey wrote:
>>>>>>
>>>>>>> Awesome.
>>>>>>>
>>>>>>> The branch is updated to the latest trunk head. I encourage folks to
>>>>>>> check out this repository and make sure that it builds on their
>>>>>>> system. A normal build of the branch should be enough to find out if
>>>>>>> there are any cut-n-paste problems (though I tried to be careful,
>>>>>>> mistakes do happen).
>>>>>>>
>>>>>>> I haven't heard any problems so this is looking like it will come in
>>>>>>> tomorrow after the teleconf. I'll ask again there to see if there are
>>>>>>> any voices of concern.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Josh
>>>>>>>
>>>>>>> On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
>>>>>>>
>>>>>>>> This all sounds good to me!
>>>>>>>>
>>>>>>>> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
>>>>>>>>
>>>>>>>>> What: Add mca_base_select() and adjust frameworks & components to
>>>>>>>>> use it.
>>>>>>>>> Why: Consolidation of code for general goodness.
>>>>>>>>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
>>>>>>>>> When: Code ready now. Documentation ready soon.
>>>>>>>>> Timeout: May 6, 2008 (After teleconf) [1 week]
>>>>>>>>>
>>>>>>>>> Discussion:
>>>>>>>>> -----------
>>>>>>>>> For a number of years a few developers have been talking about
>>>>>>>>> creating an MCA base component selection function. For various
>>>>>>>>> reasons this was never implemented. Recently I decided to give it
>>>>>>>>> a try.
>>>>>>>>>
>>>>>>>>> A base select function will allow Open MPI to provide completely
>>>>>>>>> consistent selection behavior for many of its frameworks (18 of 31
>>>>>>>>> to be exact at the moment). The primary goal of this work is to
>>>>>>>>> improve code maintainability through code reuse. Other benefits
>>>>>>>>> also result, such as a slightly smaller memory footprint.
>>>>>>>>>
>>>>>>>>> The mca_base_select() function represents the most commonly used
>>>>>>>>> logic for component selection: select the one component with the
>>>>>>>>> highest priority and close all of the components that were not
>>>>>>>>> selected. This function can be found at the path below in the
>>>>>>>>> branch:
>>>>>>>>> opal/mca/base/mca_base_components_select.c
>>>>>>>>>
>>>>>>>>> To support this I had to formalize a query() function in the
>>>>>>>>> mca_base_component_t of the form:
>>>>>>>>> int mca_base_query_component_fn(mca_base_module_t **module,
>>>>>>>>>                                 int *priority);
>>>>>>>>>
>>>>>>>>> This function is specified after the open and close component
>>>>>>>>> functions in this structure so as to allow compatibility with
>>>>>>>>> frameworks that do not use the base selection logic. Frameworks
>>>>>>>>> that do *not* use this function are *not* affected by this commit.
>>>>>>>>> However, every component in the frameworks that use the
>>>>>>>>> mca_base_select function must adjust its component query function
>>>>>>>>> to fit the form specified above.
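
As a rough illustration of the selection logic described in this RFC, the
sketch below queries each component through a function of the form quoted
above, keeps the one reporting the highest priority, and closes the rest.
All demo_* names and types are hypothetical stand-ins rather than the real
Open MPI definitions, and the choice to treat only priorities >= 0 as
selectable is an assumption taken from the behavior reported in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types live in the Open MPI tree. */
typedef struct { const char *label; } demo_module_t;

/* Same shape as the query function in the RFC: module and priority out. */
typedef int (*demo_query_fn_t)(demo_module_t **module, int *priority);
typedef int (*demo_close_fn_t)(void);

typedef struct {
    const char *name;
    demo_close_fn_t close;   /* listed before query, as the RFC describes */
    demo_query_fn_t query;
} demo_component_t;

/* Two sample components used to exercise the selection loop. */
static demo_module_t demo_low_module  = { "low" };
static demo_module_t demo_high_module = { "high" };
static int demo_closed = 0;

static int demo_low_query(demo_module_t **module, int *priority)
{
    *module = &demo_low_module;
    *priority = 10;
    return 0;
}

static int demo_high_query(demo_module_t **module, int *priority)
{
    *module = &demo_high_module;
    *priority = 50;
    return 0;
}

static int demo_count_close(void)
{
    ++demo_closed;
    return 0;
}

/* Pick the highest-priority component; close every component not chosen. */
static int demo_base_select(demo_component_t *comps, size_t n,
                            demo_module_t **best_module,
                            demo_component_t **best_component)
{
    int best_priority = -1;   /* assumption: only priorities >= 0 can win */

    assert(best_module != NULL && best_component != NULL);
    *best_module = NULL;
    *best_component = NULL;

    for (size_t i = 0; i < n; ++i) {
        demo_module_t *module = NULL;
        int priority = -1;

        if (comps[i].query == NULL ||
            comps[i].query(&module, &priority) != 0 || module == NULL) {
            continue;   /* component does not want to be considered */
        }
        if (priority > best_priority) {
            best_priority = priority;
            *best_module = module;
            *best_component = &comps[i];
        }
    }

    for (size_t i = 0; i < n; ++i) {
        if (&comps[i] != *best_component && comps[i].close != NULL) {
            comps[i].close();
        }
    }
    return (*best_component != NULL) ? 0 : -1;
}
```

Given an array holding the "low" and "high" sample components,
demo_base_select() picks the second one and calls close once, for the
component that lost.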
>>>>>>>>>
>>>>>>>>> 18 frameworks in Open MPI have been changed. I have updated all of
>>>>>>>>> the components in the 18 frameworks available in the trunk on my
>>>>>>>>> branch. The affected frameworks are:
>>>>>>>>> - OPAL carto
>>>>>>>>> - OPAL crs
>>>>>>>>> - OPAL maffinity
>>>>>>>>> - OPAL memchecker
>>>>>>>>> - OPAL paffinity
>>>>>>>>> - ORTE errmgr
>>>>>>>>> - ORTE ess
>>>>>>>>> - ORTE filem
>>>>>>>>> - ORTE grpcomm
>>>>>>>>> - ORTE odls
>>>>>>>>> - ORTE plm
>>>>>>>>> - ORTE ras
>>>>>>>>> - ORTE rmaps
>>>>>>>>> - ORTE routed
>>>>>>>>> - ORTE snapc
>>>>>>>>> - OMPI crcp
>>>>>>>>> - OMPI dpm
>>>>>>>>> - OMPI pubsub
>>>>>>>>>
>>>>>>>>> There was a question about the memory footprint change as a result
>>>>>>>>> of this commit. I used 'pmap' to determine the process memory
>>>>>>>>> footprint of a hello world MPI program. Static and shared build
>>>>>>>>> numbers are below, along with variations on launching locally and
>>>>>>>>> to a single node allocated by SLURM. All of this was on Indiana
>>>>>>>>> University's Odin machine. We compare against the trunk (r18276),
>>>>>>>>> representing the last SVN sync point of the branch.
>>>>>>>>>
>>>>>>>>> Process(shared)| Trunk    | Branch  | Diff (Improvement)
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun (orted) |   39976K |  36828K | 3148K
>>>>>>>>> hello (0)      |  229288K | 229268K |   20K
>>>>>>>>> hello (1)      |  229288K | 229268K |   20K
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun         |   40032K |  37924K | 2108K
>>>>>>>>> orted          |   34720K |  34660K |   60K
>>>>>>>>> hello (0)      |  228404K | 228384K |   20K
>>>>>>>>> hello (1)      |  228404K | 228384K |   20K
>>>>>>>>>
>>>>>>>>> Process(static)| Trunk    | Branch  | Diff (Improvement)
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun (orted) |   21384K |  21372K |   12K
>>>>>>>>> hello (0)      |  194000K | 193980K |   20K
>>>>>>>>> hello (1)      |  194000K | 193980K |   20K
>>>>>>>>> ---------------+----------+---------+-------
>>>>>>>>> mpirun         |   21384K |  21372K |   12K
>>>>>>>>> orted          |   21208K |  21196K |   12K
>>>>>>>>> hello (0)      |  193116K | 193096K |   20K
>>>>>>>>> hello (1)      |  193116K | 193096K |   20K
>>>>>>>>>
>>>>>>>>> As you can see there are some small memory footprint improvements
>>>>>>>>> on my branch that result from this work. The size of the Open MPI
>>>>>>>>> project shrinks a bit as well. This commit cuts between 2,000 and
>>>>>>>>> 3,500 lines of code (depending on how you count), so about a 1%
>>>>>>>>> code shrink.
>>>>>>>>>
>>>>>>>>> The branch is stable in all of the testing I have done, but there
>>>>>>>>> are some platforms on which I cannot test. So please give this
>>>>>>>>> branch a try and let me know if you find any problems.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Josh
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> devel mailing list
>>>>>>>>> devel_at_[hidden]
>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Squyres
>>>>>>>> Cisco Systems
>>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> - Pak Lui
>>>>> pak.lui_at_[hidden]
>>>>
>>>
>>
>>
>