Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [RFC] mca_base_select()
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2008-05-09 10:52:19


Ralph,

Can you give me an example of a component that I can look at? It will
allow me to test the fix before committing, and to better understand
the problem.

-- Josh

On May 9, 2008, at 10:41 AM, Ralph Castain wrote:

> I just hit a problem with this logic - should be a minor change.
>
> We have several frameworks where we have components that are only
> allowed to be selected if the user specifically requests them by
> stating -mca foo bar. Because it is possible for there to be no other
> components that want to be selected, and because it is permissible for
> no components to be selected for that framework, we set bar's priority
> to be -1.
>
> The new select logic will not allow a component with a negative
> priority to be selected, even if the user specifically requested that
> component.
>
> If we set the priority to be 0, then the system will allow the
> component to be automatically selected. This is not allowed as it can
> lead to bad behavior.
>
> So what we need the select system to do is say "if someone specified a
> specific component, don't worry about the returned priority - just
> use it"
>
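
The rule requested above can be sketched in a few lines of C. This is
only an illustration: the helper name may_select() and its
explicitly_requested flag are hypothetical and are not part of the MCA
base. The sketch assumes the caller already knows whether the user named
the component with -mca.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not an actual MCA base function): decide whether
 * a component that reported `priority` from its query may be selected. */
bool may_select(int priority, bool explicitly_requested)
{
    if (explicitly_requested) {
        return true;        /* the user named it: ignore the priority */
    }
    return priority >= 0;   /* negative priority: never auto-select */
}

/* Small self-check documenting the intended behavior. */
void may_select_self_check(void)
{
    assert(may_select(-1, true));    /* requested, priority -1: accepted */
    assert(!may_select(-1, false));  /* not requested, -1: rejected */
    assert(may_select(0, false));    /* priority 0 is auto-selectable */
}
```

With a check of this kind, a component can keep returning -1 to stay out
of automatic selection and still be usable when it is requested by name.
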
> Josh: could you please modify this?
>
> Thanks!
> Ralph
>
>
>
> On 5/8/08 7:04 PM, "Pak Lui" <Pak.Lui_at_[hidden]> wrote:
>
>> Thanks very much Josh! Will try it out soon.
>>
>> Josh Hursey wrote:
>>> Sorry about that. I didn't test that type of option. It should be
>>> working in r18418. Let me know if you see any more issues.
>>>
>>> -- Josh
>>>
>>> On May 8, 2008, at 6:04 PM, Pak Lui wrote:
>>>
>>>> I think I have a problem but I am not sure. I used to be able to use
>>>> the circumflex (^) to switch between the gridengine launcher and the
>>>> ssh launchers by doing something like this, e.g. -mca plm ^gridengine,
>>>> to exclude some of the plm components (and also in ras). It doesn't
>>>> seem like the 'negate' is in mca_base_component anymore. I guess I
>>>> just have to spell out which component I want explicitly...
>>>>
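
For readers unfamiliar with the circumflex syntax mentioned above, here
is a minimal sketch of how an include/exclude list such as "^gridengine"
can be interpreted. The function name component_allowed() is
hypothetical and this is not the MCA base implementation. It assumes a
comma-separated list in which a leading '^' negates the whole list.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if component `name` passes the
 * user's list `spec`, e.g. "rsh,slurm" (include) or "^gridengine"
 * (exclude). A NULL or empty spec places no restriction. */
bool component_allowed(const char *spec, const char *name)
{
    bool negate = false;
    bool listed = false;
    size_t name_len = strlen(name);

    if (spec == NULL || *spec == '\0') {
        return true;
    }
    if (*spec == '^') {         /* leading '^' turns the list into excludes */
        negate = true;
        ++spec;
    }

    /* Walk the comma-separated tokens without modifying spec. */
    while (*spec != '\0') {
        size_t len = strcspn(spec, ",");
        if (len == name_len && strncmp(spec, name, len) == 0) {
            listed = true;
            break;
        }
        spec += len;
        if (*spec == ',') {
            ++spec;
        }
    }
    return negate ? !listed : listed;
}
```

Under these assumptions, component_allowed("^gridengine", "rsh") is true
and component_allowed("^gridengine", "gridengine") is false. The
component names are used only as examples.
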
>>>> Josh Hursey wrote:
>>>>> This has been committed in r18381
>>>>>
>>>>> Please let me know if you have any problems with this commit.
>>>>>
>>>>> Cheers,
>>>>> Josh
>>>>>
>>>>> On May 5, 2008, at 10:41 AM, Josh Hursey wrote:
>>>>>
>>>>>> Awesome.
>>>>>>
>>>>>> The branch is updated to the latest trunk head. I encourage folks to
>>>>>> check out this repository and make sure that it builds on their
>>>>>> system. A normal build of the branch should be enough to find out if
>>>>>> there are any cut-n-paste problems (though I tried to be careful,
>>>>>> mistakes do happen).
>>>>>>
>>>>>> I haven't heard of any problems, so this is looking like it will come
>>>>>> in tomorrow after the teleconf. I'll ask again there to see if there
>>>>>> are any voices of concern.
>>>>>>
>>>>>> Cheers,
>>>>>> Josh
>>>>>>
>>>>>> On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
>>>>>>
>>>>>>> This all sounds good to me!
>>>>>>>
>>>>>>> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
>>>>>>>
>>>>>>>> What: Add mca_base_select() and adjust frameworks & components to
>>>>>>>> use it.
>>>>>>>> Why: Consolidation of code for general goodness.
>>>>>>>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
>>>>>>>> When: Code ready now. Documentation ready soon.
>>>>>>>> Timeout: May 6, 2008 (After teleconf) [1 week]
>>>>>>>>
>>>>>>>> Discussion:
>>>>>>>> -----------
>>>>>>>> For a number of years a few developers have been talking about
>>>>>>>> creating an MCA base component selection function. For various
>>>>>>>> reasons this was never implemented. Recently I decided to give it a
>>>>>>>> try.
>>>>>>>>
>>>>>>>> A base select function will allow Open MPI to provide completely
>>>>>>>> consistent selection behavior for many of its frameworks (18 of 31
>>>>>>>> to be exact at the moment). The primary goal of this work is to
>>>>>>>> improve code maintainability through code reuse. Other benefits
>>>>>>>> also result, such as a slightly smaller memory footprint.
>>>>>>>>
>>>>>>>> The mca_base_select() function represents the most commonly used
>>>>>>>> logic for component selection: Select the one component with the
>>>>>>>> highest priority and close all of the components that were not
>>>>>>>> selected. This function can be found at the path below in the
>>>>>>>> branch:
>>>>>>>> opal/mca/base/mca_base_components_select.c
>>>>>>>>
>>>>>>>> To support this I had to formalize a query() function in the
>>>>>>>> mca_base_component_t of the form:
>>>>>>>> int mca_base_query_component_fn(mca_base_module_t **module,
>>>>>>>>                                 int *priority);
>>>>>>>>
>>>>>>>> This function is specified after the open and close component
>>>>>>>> functions in this structure so as to allow compatibility with
>>>>>>>> frameworks that do not use the base selection logic. Frameworks
>>>>>>>> that do *not* use this function are *not* affected by this commit.
>>>>>>>> However, every component in the frameworks that use the
>>>>>>>> mca_base_select function must adjust its component query function
>>>>>>>> to match the form specified above.
>>>>>>>>
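
The selection logic and query function described above can be sketched
roughly as follows. This is a simplified illustration, not the code in
mca_base_components_select.c: all example_* and demo_* names are
hypothetical stand-ins, a return value of 0 from query is assumed to
mean success, and a negative priority is assumed to mean "do not select
automatically".

```c
#include <stddef.h>

/* Stand-in types for this sketch; the real MCA structures differ. */
typedef struct example_module { const char *name; } example_module_t;

/* Same shape as the query function quoted above. */
typedef int (*example_query_fn_t)(example_module_t **module, int *priority);

typedef struct example_component {
    const char *name;
    example_query_fn_t query;
    void (*close)(void);            /* optional, may be NULL */
} example_component_t;

/* Query every component, keep the one with the highest non-negative
 * priority, and close the rest. Returns the winner's index, or -1 if
 * no component was selected. */
int example_select(example_component_t *components, size_t count,
                   example_module_t **best_module)
{
    int best = -1;
    int best_priority = -1;
    size_t i;

    *best_module = NULL;
    for (i = 0; i < count; ++i) {
        example_module_t *module = NULL;
        int priority = -1;

        if (components[i].query == NULL) {
            continue;
        }
        /* Assumption: 0 means the query succeeded. */
        if (components[i].query(&module, &priority) != 0) {
            continue;
        }
        if (priority > best_priority) {
            best = (int)i;
            best_priority = priority;
            *best_module = module;
        }
    }

    /* Close every component that was not selected. */
    for (i = 0; i < count; ++i) {
        if ((int)i != best && components[i].close != NULL) {
            components[i].close();
        }
    }
    return best;
}

/* Two demo components, used only to exercise the sketch. */
example_module_t demo_low_module  = { "low" };
example_module_t demo_high_module = { "high" };
int demo_close_calls = 0;

int demo_low_query(example_module_t **module, int *priority)
{
    *module = &demo_low_module;
    *priority = 10;
    return 0;
}

int demo_high_query(example_module_t **module, int *priority)
{
    *module = &demo_high_module;
    *priority = 50;
    return 0;
}

void demo_close(void)
{
    ++demo_close_calls;
}
```

Given the two demo components, example_select() picks the one reporting
priority 50 and calls close on the other.
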
>>>>>>>> 18 frameworks in Open MPI have been changed. I have updated all of
>>>>>>>> the components in the 18 frameworks available in the trunk on my
>>>>>>>> branch. The affected frameworks are:
>>>>>>>> - OPAL carto
>>>>>>>> - OPAL crs
>>>>>>>> - OPAL maffinity
>>>>>>>> - OPAL memchecker
>>>>>>>> - OPAL paffinity
>>>>>>>> - ORTE errmgr
>>>>>>>> - ORTE ess
>>>>>>>> - ORTE filem
>>>>>>>> - ORTE grpcomm
>>>>>>>> - ORTE odls
>>>>>>>> - ORTE plm
>>>>>>>> - ORTE ras
>>>>>>>> - ORTE rmaps
>>>>>>>> - ORTE routed
>>>>>>>> - ORTE snapc
>>>>>>>> - OMPI crcp
>>>>>>>> - OMPI dpm
>>>>>>>> - OMPI pubsub
>>>>>>>>
>>>>>>>> There was a question of the memory footprint change as a result of
>>>>>>>> this commit. I used 'pmap' to determine the process memory
>>>>>>>> footprint of a hello world MPI program. Static and shared build
>>>>>>>> numbers are below, along with variations on launching locally and
>>>>>>>> to a single node allocated by SLURM. All of this was on Indiana
>>>>>>>> University's Odin machine. We compare against the trunk (r18276),
>>>>>>>> representing the last SVN sync point of the branch.
>>>>>>>>
>>>>>>>> Process (shared) | Trunk   | Branch  | Diff (Improvement)
>>>>>>>> -----------------+---------+---------+-------------------
>>>>>>>> mpirun (orted)   |  39976K |  36828K | 3148K
>>>>>>>> hello (0)        | 229288K | 229268K |   20K
>>>>>>>> hello (1)        | 229288K | 229268K |   20K
>>>>>>>> -----------------+---------+---------+-------------------
>>>>>>>> mpirun           |  40032K |  37924K | 2108K
>>>>>>>> orted            |  34720K |  34660K |   60K
>>>>>>>> hello (0)        | 228404K | 228384K |   20K
>>>>>>>> hello (1)        | 228404K | 228384K |   20K
>>>>>>>>
>>>>>>>> Process (static) | Trunk   | Branch  | Diff (Improvement)
>>>>>>>> -----------------+---------+---------+-------------------
>>>>>>>> mpirun (orted)   |  21384K |  21372K |   12K
>>>>>>>> hello (0)        | 194000K | 193980K |   20K
>>>>>>>> hello (1)        | 194000K | 193980K |   20K
>>>>>>>> -----------------+---------+---------+-------------------
>>>>>>>> mpirun           |  21384K |  21372K |   12K
>>>>>>>> orted            |  21208K |  21196K |   12K
>>>>>>>> hello (0)        | 193116K | 193096K |   20K
>>>>>>>> hello (1)        | 193116K | 193096K |   20K
>>>>>>>>
>>>>>>>> As you can see, there are some small memory footprint improvements
>>>>>>>> on my branch that result from this work. The size of the Open MPI
>>>>>>>> project shrinks a bit as well. This commit cuts between 2,000 and
>>>>>>>> 3,500 lines of code (depending on how you count), so about a 1%
>>>>>>>> code shrink.
>>>>>>>>
>>>>>>>> The branch is stable in all of the testing I have done, but there
>>>>>>>> are some platforms on which I cannot test. So please give this
>>>>>>>> branch a try and let me know if you find any problems.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Josh
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> devel mailing list
>>>>>>>> devel_at_[hidden]
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>
>>>>>>> --
>>>>>>> Jeff Squyres
>>>>>>> Cisco Systems
>>>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> - Pak Lui
>>>> pak.lui_at_[hidden]