Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [RFC] mca_base_select()
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2008-05-06 14:09:43


This has been committed in r18381.

Please let me know if you have any problems with this commit.

Cheers,
Josh

On May 5, 2008, at 10:41 AM, Josh Hursey wrote:

> Awesome.
>
> The branch is updated to the latest trunk head. I encourage folks to
> check out this repository and make sure that it builds on their
> system. A normal build of the branch should be enough to find out if
> there are any cut-n-paste problems (though I tried to be careful,
> mistakes do happen).
>
> I haven't heard of any problems, so this looks like it will come in
> tomorrow after the teleconf. I'll ask again there to see if there are
> any voices of concern.
>
> Cheers,
> Josh
>
> On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
>
>> This all sounds good to me!
>>
>> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
>>
>>> What: Add mca_base_select() and adjust frameworks & components to
>>> use it.
>>> Why: Consolidation of code for general goodness.
>>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
>>> When: Code ready now. Documentation ready soon.
>>> Timeout: May 6, 2008 (After teleconf) [1 week]
>>>
>>> Discussion:
>>> -----------
>>> For a number of years a few developers have been talking about
>>> creating an MCA base component selection function. For various
>>> reasons this was never implemented. Recently I decided to give it
>>> a try.
>>>
>>> A base select function will allow Open MPI to provide completely
>>> consistent selection behavior for many of its frameworks (18 of 31,
>>> to be exact, at the moment). The primary goal of this work is to
>>> improve code maintainability through code reuse. Other benefits
>>> also result, such as a slightly smaller memory footprint.
>>>
>>> The mca_base_select() function represents the most commonly used
>>> logic for component selection: select the one component with the
>>> highest priority and close all of the components that were not
>>> selected. This function can be found at the path below in the
>>> branch:
>>> opal/mca/base/mca_base_components_select.c
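
The selection rule described above can be illustrated with a minimal sketch. This is not the code in mca_base_components_select.c: the `toy_component_t` type, its fields, and `toy_select()` are hypothetical names invented for this example, and the real function operates on Open MPI's own component lists rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a component; NOT the real mca_base_component_t. */
typedef struct toy_component {
    const char *name;
    int priority;   /* priority the component would report when queried */
    int available;  /* 0 if the component declines to be selected */
    int closed;     /* set to 1 once the component has been closed */
} toy_component_t;

/* Select the available component with the highest priority and mark every
 * other component as closed. Returns NULL if nothing is available. */
toy_component_t *toy_select(toy_component_t *comps, size_t n)
{
    toy_component_t *best = NULL;
    size_t i;

    assert(comps != NULL || n == 0);

    /* Pass 1: find the highest-priority component that is willing to run. */
    for (i = 0; i < n; ++i) {
        if (!comps[i].available) {
            continue;
        }
        if (best == NULL || comps[i].priority > best->priority) {
            best = &comps[i];
        }
    }

    /* Pass 2: close everything that was not selected. */
    for (i = 0; i < n; ++i) {
        if (&comps[i] != best) {
            comps[i].closed = 1;
        }
    }
    return best;
}
```

Given three toy components with priorities 10, 50, and 90, where the last one reports itself unavailable, `toy_select()` returns the one with priority 50 and marks the other two closed.
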
>>>
>>> To support this I had to formalize a query() function in the
>>> mca_base_component_t of the form:
>>> int mca_base_query_component_fn(mca_base_module_t **module,
>>>                                 int *priority);
>>>
>>> This function is specified after the open and close component
>>> functions in this structure so as to allow compatibility with
>>> frameworks that do not use the base selection logic. Frameworks
>>> that do *not* use this function are *not* affected by this commit.
>>> However, every component in the frameworks that use the
>>> mca_base_select() function must adjust its component query
>>> function to fit the form specified above.
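
As a rough sketch of what a component query function of that form might look like, assuming simplified stand-in types: `sketch_module_t`, `sketch_component_t`, `example_component_query()`, and the priority value 20 are all hypothetical, and the real structures in Open MPI carry more fields than are shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the MCA module and component types. */
typedef struct sketch_module {
    const char *name;
} sketch_module_t;

typedef int (*sketch_open_fn_t)(void);
typedef int (*sketch_close_fn_t)(void);
typedef int (*sketch_query_fn_t)(sketch_module_t **module, int *priority);

typedef struct sketch_component {
    const char        *name;
    sketch_open_fn_t   open;
    sketch_close_fn_t  close;
    sketch_query_fn_t  query;  /* listed after open/close, as described above */
} sketch_component_t;

/* The module this example component hands back when it is queried. */
sketch_module_t example_module = { "example" };

/* Query function in the two-argument form: report a module and a priority.
 * Returning 0 stands in for a success code in this sketch. */
int example_component_query(sketch_module_t **module, int *priority)
{
    assert(module != NULL && priority != NULL);
    *module = &example_module;
    *priority = 20;  /* arbitrary illustrative value */
    return 0;
}
```

In C, an initializer that stops after the close pointer leaves the remaining members zero-initialized, which is presumably part of why placing the query pointer after open and close keeps components in other frameworks compiling unchanged.
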
>>>
>>> 18 frameworks in Open MPI have been changed. I have updated all of
>>> the components in the 18 frameworks available in the trunk on my
>>> branch. The affected frameworks are:
>>> - OPAL carto
>>> - OPAL crs
>>> - OPAL maffinity
>>> - OPAL memchecker
>>> - OPAL paffinity
>>> - ORTE errmgr
>>> - ORTE ess
>>> - ORTE filem
>>> - ORTE grpcomm
>>> - ORTE odls
>>> - ORTE plm
>>> - ORTE ras
>>> - ORTE rmaps
>>> - ORTE routed
>>> - ORTE snapc
>>> - OMPI crcp
>>> - OMPI dpm
>>> - OMPI pubsub
>>>
>>> There was a question of the memory footprint change as a result of
>>> this commit. I used 'pmap' to determine the process memory
>>> footprint of a hello world MPI program. Static and shared build
>>> numbers are below, along with variations on launching locally and
>>> to a single node allocated by SLURM. All of this was on Indiana
>>> University's Odin machine. We compare against the trunk (r18276),
>>> which represents the last SVN sync point of the branch.
>>>
>>> Process (shared) | Trunk   | Branch  | Diff (Improvement)
>>> -----------------+---------+---------+-------------------
>>> mpirun (orted)   |  39976K |  36828K | 3148K
>>> hello (0)        | 229288K | 229268K |   20K
>>> hello (1)        | 229288K | 229268K |   20K
>>> -----------------+---------+---------+-------------------
>>> mpirun           |  40032K |  37924K | 2108K
>>> orted            |  34720K |  34660K |   60K
>>> hello (0)        | 228404K | 228384K |   20K
>>> hello (1)        | 228404K | 228384K |   20K
>>>
>>> Process (static) | Trunk   | Branch  | Diff (Improvement)
>>> -----------------+---------+---------+-------------------
>>> mpirun (orted)   |  21384K |  21372K |   12K
>>> hello (0)        | 194000K | 193980K |   20K
>>> hello (1)        | 194000K | 193980K |   20K
>>> -----------------+---------+---------+-------------------
>>> mpirun           |  21384K |  21372K |   12K
>>> orted            |  21208K |  21196K |   12K
>>> hello (0)        | 193116K | 193096K |   20K
>>> hello (1)        | 193116K | 193096K |   20K
>>>
>>> As you can see there are some small memory footprint improvements
>>> on my branch that result from this work. The size of the Open MPI
>>> project shrinks a bit as well. This commit cuts between 2,000 and
>>> 3,500 lines of code (depending on how you count), so about a 1%
>>> code shrink.
>>>
>>> The branch is stable in all of the testing I have done, but there
>>> are some platforms on which I cannot test. So please give this
>>> branch a try and let me know if you find any problems.
>>>
>>> Cheers,
>>> Josh
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems