Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [RFC] mca_base_select()
From: Pak Lui (Pak.Lui_at_[hidden])
Date: 2008-05-08 18:04:10


I think I have a problem, but I am not sure. I used to be able to use the
circumflex (^) to switch between the gridengine launcher and the ssh
launcher by excluding components in plm (and also in ras), e.g.
-mca plm ^gridengine. It doesn't seem like the 'negate' handling is in
mca_base_component anymore. I guess I just have to spell out explicitly
which component I want...
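
A minimal sketch of the include/exclude semantics referred to above, assuming the value passed to -mca is a comma-separated list of component names and that a leading ^ turns the whole list into an exclusion list. The helper names list_contains and component_allowed are hypothetical; this is an illustration, not the MCA base code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Open MPI code): returns true if `name`
 * appears as a whole comma-separated token in `list`. */
static bool list_contains(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t tok = (end != NULL) ? (size_t)(end - p) : strlen(p);

        if (tok == len && strncmp(p, name, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return false;
}

/* Hypothetical helper: decides whether component `name` is eligible
 * given the value of a framework parameter such as "plm".
 *   NULL or ""    -> every component is eligible
 *   "^a,b"        -> every component except a and b
 *   "a,b"         -> only a and b */
static bool component_allowed(const char *spec, const char *name)
{
    if (spec == NULL || spec[0] == '\0') {
        return true;
    }
    if (spec[0] == '^') {
        return !list_contains(spec + 1, name);
    }
    return list_contains(spec, name);
}
```

Under these assumptions, component_allowed("^gridengine", "rsh") is true and component_allowed("^gridengine", "gridengine") is false, which is the switching behavior described above.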

Josh Hursey wrote:
> This has been committed in r18381
>
> Please let me know if you have any problems with this commit.
>
> Cheers,
> Josh
>
> On May 5, 2008, at 10:41 AM, Josh Hursey wrote:
>
>> Awesome.
>>
>> The branch is updated to the latest trunk head. I encourage folks to
>> check out this repository and make sure that it builds on their
>> system. A normal build of the branch should be enough to find out if
>> there are any cut-n-paste problems (though I tried to be careful,
>> mistakes do happen).
>>
>> I haven't heard any problems so this is looking like it will come in
>> tomorrow after the teleconf. I'll ask again there to see if there are
>> any voices of concern.
>>
>> Cheers,
>> Josh
>>
>> On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
>>
>>> This all sounds good to me!
>>>
>>> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
>>>
>>>> What: Add mca_base_select() and adjust frameworks & components to
>>>> use it.
>>>> Why: Consolidation of code for general goodness.
>>>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
>>>> When: Code ready now. Documentation ready soon.
>>>> Timeout: May 6, 2008 (After teleconf) [1 week]
>>>>
>>>> Discussion:
>>>> -----------
>>>> For a number of years a few developers have been talking about
>>>> creating an MCA base component selection function. For various
>>>> reasons this was never implemented. Recently I decided to give it
>>>> a try.
>>>>
>>>> A base select function will allow Open MPI to provide completely
>>>> consistent selection behavior for many of its frameworks (18 of 31,
>>>> to be exact, at the moment). The primary goal of this work is to
>>>> improve code maintainability through code reuse. Other benefits also
>>>> result, such as a slightly smaller memory footprint.
>>>>
>>>> The mca_base_select() function represents the most commonly used
>>>> logic for component selection: select the one component with the
>>>> highest priority and close all of the components that were not
>>>> selected. This function can be found at the path below in the branch:
>>>> opal/mca/base/mca_base_components_select.c
>>>>
>>>> To support this I had to formalize a query() function in the
>>>> mca_base_component_t of the form:
>>>> int mca_base_query_component_fn(mca_base_module_t **module, int *priority);
>>>>
>>>> This function is specified after the open and close component
>>>> functions in this structure so as to allow compatibility with
>>>> frameworks that do not use the base selection logic. Frameworks that
>>>> do *not* use this function are *not* affected by this commit.
>>>> However, every component in the frameworks that use the
>>>> mca_base_select() function must adjust its component query function
>>>> to fit the form specified above.
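
The selection rule and query() form described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in opal/mca/base/mca_base_components_select.c: the sketch_* types and functions are hypothetical stand-ins for the real MCA structures, the demo_* components are invented for the example, and a return value of 0 from query() is assumed to mean "available".

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real MCA module/component types. */
typedef struct sketch_module { int placeholder; } sketch_module_t;

/* Same shape as the query function quoted above. */
typedef int (*sketch_query_fn_t)(sketch_module_t **module, int *priority);
typedef int (*sketch_close_fn_t)(void);

typedef struct sketch_component {
    const char *name;
    sketch_query_fn_t query;   /* may be NULL: component is never selected */
    sketch_close_fn_t close;   /* may be NULL: nothing to release */
} sketch_component_t;

/* Query every component, keep the one reporting the highest priority,
 * then close all components that were not selected.
 * Returns 0 if a component was selected, -1 otherwise. */
static int sketch_select(sketch_component_t *components, size_t count,
                         sketch_module_t **best_module,
                         sketch_component_t **best_component)
{
    int best_priority = 0;

    *best_module = NULL;
    *best_component = NULL;

    for (size_t i = 0; i < count; ++i) {
        sketch_module_t *module = NULL;
        int priority = 0;

        if (components[i].query == NULL ||
            components[i].query(&module, &priority) != 0) {
            continue;  /* component declined or cannot be queried */
        }
        if (*best_component == NULL || priority > best_priority) {
            best_priority = priority;
            *best_module = module;
            *best_component = &components[i];
        }
    }

    for (size_t i = 0; i < count; ++i) {
        if (&components[i] != *best_component &&
            components[i].close != NULL) {
            components[i].close();
        }
    }
    return (*best_component != NULL) ? 0 : -1;
}

/* Invented demo components used only to exercise the sketch. */
static int demo_closed_count = 0;
static sketch_module_t demo_low_module, demo_high_module;

static int demo_close(void) { ++demo_closed_count; return 0; }

static int demo_low_query(sketch_module_t **module, int *priority)
{
    *module = &demo_low_module;
    *priority = 10;
    return 0;
}

static int demo_high_query(sketch_module_t **module, int *priority)
{
    *module = &demo_high_module;
    *priority = 50;
    return 0;
}

static int demo_unavailable_query(sketch_module_t **module, int *priority)
{
    (void)module;
    (void)priority;
    return -1;  /* component reports that it cannot run here */
}
```

Given three components using demo_low_query, demo_unavailable_query and demo_high_query, sketch_select() picks the third and calls close on the other two. Doing the close pass after the query pass keeps the loop simple, at the cost of iterating over the list twice.
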
>>>>
>>>> 18 frameworks in Open MPI have been changed. I have updated all of
>>>> the components in the 18 frameworks available in the trunk on my
>>>> branch. The affected frameworks are:
>>>> - OPAL carto
>>>> - OPAL crs
>>>> - OPAL maffinity
>>>> - OPAL memchecker
>>>> - OPAL paffinity
>>>> - ORTE errmgr
>>>> - ORTE ess
>>>> - ORTE filem
>>>> - ORTE grpcomm
>>>> - ORTE odls
>>>> - ORTE plm
>>>> - ORTE ras
>>>> - ORTE rmaps
>>>> - ORTE routed
>>>> - ORTE snapc
>>>> - OMPI crcp
>>>> - OMPI dpm
>>>> - OMPI pubsub
>>>>
>>>> There was a question about how the memory footprint changes as a
>>>> result of this commit. I used 'pmap' to determine the process memory
>>>> footprint of a hello world MPI program. Static and shared build
>>>> numbers are below, along with variations on launching locally and to
>>>> a single node allocated by SLURM. All of this was on Indiana
>>>> University's Odin machine. We compare against the trunk (r18276),
>>>> which represents the last SVN sync point of the branch.
>>>>
>>>> Process (shared) | Trunk   | Branch  | Diff (Improvement)
>>>> -----------------+---------+---------+-------------------
>>>> mpirun (orted)   |  39976K |  36828K | 3148K
>>>> hello (0)        | 229288K | 229268K |   20K
>>>> hello (1)        | 229288K | 229268K |   20K
>>>> -----------------+---------+---------+-------------------
>>>> mpirun           |  40032K |  37924K | 2108K
>>>> orted            |  34720K |  34660K |   60K
>>>> hello (0)        | 228404K | 228384K |   20K
>>>> hello (1)        | 228404K | 228384K |   20K
>>>>
>>>> Process (static) | Trunk   | Branch  | Diff (Improvement)
>>>> -----------------+---------+---------+-------------------
>>>> mpirun (orted)   |  21384K |  21372K |   12K
>>>> hello (0)        | 194000K | 193980K |   20K
>>>> hello (1)        | 194000K | 193980K |   20K
>>>> -----------------+---------+---------+-------------------
>>>> mpirun           |  21384K |  21372K |   12K
>>>> orted            |  21208K |  21196K |   12K
>>>> hello (0)        | 193116K | 193096K |   20K
>>>> hello (1)        | 193116K | 193096K |   20K
>>>>
>>>> As you can see, there are some small memory footprint improvements
>>>> on my branch that result from this work. The size of the Open MPI
>>>> project shrinks a bit as well. This commit cuts between 2,000 and
>>>> 3,500 lines of code (depending on how you count), so about a 1% code
>>>> shrink.
>>>>
>>>> The branch is stable in all of the testing I have done, but there
>>>> are some platforms on which I cannot test. So please give this
>>>> branch a try and let me know if you find any problems.
>>>>
>>>> Cheers,
>>>> Josh
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems

-- 
- Pak Lui
pak.lui_at_[hidden]