Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] MCA component open
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-05-04 07:58:17


FWIW, Josh implemented "MCA-NULL" in changeset 18364:
https://svn.open-mpi.org/trac/ompi/changeset/18364

I'm not sure how I feel about this solution. On the one hand, it's
kind of a hack-ish way of solving the immediate issue. On the other
hand, it's really a larger issue of explicitly *not* setting an MCA
param (or knowing what source an MCA value originated from, depending
on how you look at it), something that we've never taken the time to
address properly. If we continue to not solve the larger issue, it's
going to come up again someday and someone will add yet another
workaround.
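
To make the gap concrete, here is a minimal sketch of the three cases
a framework's component list has to distinguish: parameter unset
(everything eligible), an explicit list (only those names), and an
explicit "open nothing" value. It assumes "MCA-NULL" is the sentinel
for the last case; component_allowed() and FILTER_NONE are
hypothetical names, not the code from the changeset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed sentinel meaning "open no components" (hypothetical macro). */
#define FILTER_NONE "MCA-NULL"

/* Decide whether `component` may be opened, given the framework's
 * parameter value `filter`: a comma-separated list of names, or
 * NULL / "" when the parameter was never set. */
static bool component_allowed(const char *filter, const char *component)
{
    assert(component != NULL);
    size_t clen = strlen(component);

    /* Unset or empty: every component is eligible. */
    if (filter == NULL || filter[0] == '\0') {
        return true;
    }
    /* Explicit sentinel: open nothing. */
    if (strcmp(filter, FILTER_NONE) == 0) {
        return false;
    }
    /* Otherwise only the listed names are eligible. */
    const char *p = filter;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current name */
        if (len == clen && strncmp(p, component, len) == 0) {
            return true;
        }
        p += len;
        if (*p == ',') {
            p++;                        /* skip the separator */
        }
    }
    return false;
}
```

Without the sentinel there is no value that expresses "none", because
the empty value already means "all".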

Questions on both fronts:

- I'm not entirely sure I understand the specific ORTE issue. Is it
that you want one "plm" MCA param value for mpirun and another value
for the other processes (i.e., the orteds)? Or, more specifically, you
want plm X in mpirun, and *no* PLMs in the orteds?

- Would adding an enum indicating where an MCA value was retrieved
from help this situation? E.g., MCA_PARAM_ENVIRONMENT,
MCA_PARAM_FILE, MCA_PARAM_DEFAULT?
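
A rough sketch of what I mean, not existing API: the three enum names
are the ones above, while mca_param_source_t, mca_param_result_t and
mca_param_resolve() are made up for illustration. The precedence
(environment over file over default) is also an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Where a parameter value came from. */
typedef enum {
    MCA_PARAM_DEFAULT,
    MCA_PARAM_FILE,
    MCA_PARAM_ENVIRONMENT
} mca_param_source_t;

/* A resolved value together with its origin (hypothetical type). */
typedef struct {
    const char *value;
    mca_param_source_t source;
} mca_param_result_t;

/* Pick the effective value. `env_value` and `file_value` are NULL
 * when that source did not set the parameter; `default_value` must
 * always be provided. */
static mca_param_result_t mca_param_resolve(const char *env_value,
                                            const char *file_value,
                                            const char *default_value)
{
    mca_param_result_t result;

    assert(default_value != NULL);
    if (env_value != NULL) {
        result.value = env_value;
        result.source = MCA_PARAM_ENVIRONMENT;
    } else if (file_value != NULL) {
        result.value = file_value;
        result.source = MCA_PARAM_FILE;
    } else {
        result.value = default_value;
        result.source = MCA_PARAM_DEFAULT;
    }
    return result;
}
```

A framework could then treat a result whose source is
MCA_PARAM_DEFAULT as "the user never set this", instead of inferring
that from an empty string.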

On May 3, 2008, at 12:02 PM, George Bosilca wrote:

>
> The problem: The orteds open all plm components before discarding
> most of them, even though "--mca plm rsh" was present on the mpirun
> command line.
>
> The non problem: In the context of the mpirun process, only the rsh
> plm is opened, as mpirun is the only process that gets the "--mca
> plm rsh" information. As this specific argument is not included in
> the list of arguments we forward to the orted processes, there is no
> way that the orteds can abide by the imposed restriction. Note that
> if the restriction is inserted in the config file, then even the
> orteds respect it. So far the only problem I can see here is that
> the orteds are opening a framework that they are not supposed to (at
> least not in most of the cases).
>
> When we implemented the MCA filtering stuff, we proposed another
> optimization. More specifically, a default component for all special
> frameworks (i.e. used or not based on the type of process) that will
> be statically linked inside the library (and therefore will not
> generate any NFS traffic). Its only goal was to execute the
> selection logic when any of its functions were called; in other
> words, an on-demand component loading feature. Starting from there, a
> real component will be selected, and all other calls to this
> component will be directed to the selected component. I perfectly
> remember that Ralph was completely against this feature for two
> reasons: 1) all components in the ORTE framework had to be loaded
> and they will do the "if(!hnp) return NULL"; 2) he proposed to
> implement the null component.
>
> I was and I'm still against 1) so I guess that any effort toward
> implementing a null or none component will have my support.
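
For readers following along, the on-demand idea described above can be
sketched roughly as below. This is an illustration only: plm_module_t,
run_selection(), proxy_spawn() and the single spawn() entry point are
hypothetical names, and the stand-in "rsh" module just echoes its
argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module interface: one function pointer per framework call. */
typedef struct {
    const char *name;
    int (*spawn)(int njobs);
} plm_module_t;

/* Stand-in for a real component that would normally be loaded on demand. */
static int rsh_spawn(int njobs) { return njobs; }
static plm_module_t rsh_module = { "rsh", rsh_spawn };

static int select_calls = 0;    /* counts how often selection ran */

/* Placeholder for the real open/select logic. */
static plm_module_t *run_selection(void)
{
    select_calls++;
    return &rsh_module;
}

static int proxy_spawn(int njobs);

/* The statically linked default module; it is active until first use. */
static plm_module_t proxy_module = { "proxy", proxy_spawn };
static plm_module_t *plm_active = &proxy_module;

/* First call: run selection, install the winner, forward the call.
 * Later calls go straight to the selected module via plm_active. */
static int proxy_spawn(int njobs)
{
    plm_module_t *selected = run_selection();

    assert(selected != NULL && selected != &proxy_module);
    plm_active = selected;
    return plm_active->spawn(njobs);
}
```

Callers always go through plm_active->spawn(); a process that never
makes such a call never runs selection and never opens a component.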
>
> george.
>
> On May 2, 2008, at 4:40 PM, Josh Hursey wrote:
>
>> We could also call it 'null' for the empty set of components? Or
>> maybe OMPI-NULL.
>>
>> Outside of the naming, do others think this is a useful feature to
>> implement?
>>
>> -- Josh
>>
>> On May 2, 2008, at 10:51 AM, Ralph Castain wrote:
>>
>>> I would think that adding a special keyword would be the correct
>>> method. I would suggest something with an "ompi" in it, perhaps
>>> capitalized so there is no confusion...something like "OMPI-NONE"?
>>>
>>>
>>> On 5/2/08 8:37 AM, "Josh Hursey" <jjhursey_at_[hidden]> wrote:
>>>
>>>> I don't believe we have the logic in place to tell
>>>> mca_component_open 'do not open anything'. (I could be wrong
>>>> though).
>>>>
>>>> Adding such an option might be useful, but we would have to
>>>> consider how that option should be specified by the user.
>>>> Currently if you do not set a value (leave empty space in
>>>> mca-params.conf) then the MCA system takes this to indicate that
>>>> all components are eligible for selection. If you specify any
>>>> options then only those options should be opened. We could add a
>>>> special keyword (such as 'none') to indicate 'open nothing'.
>>>>
>>>> What do people think about that?
>>>>
>>>> -- Josh
>>>>
>>>>
>>>> On May 2, 2008, at 10:22 AM, Ralph Castain wrote:
>>>>
>>>>> I see what the problem is. In the case of slurm, I don't want
>>>>> -any- components to be opened, even though I am going to call plm
>>>>> open/select. I have to leave that logic in place for those
>>>>> environments that -do- want to specify some backend secondary
>>>>> launcher.
>>>>>
>>>>> So the question is: how do I tell mca_component_open "do not open
>>>>> anything"?
>>>>>
>>>>> If we don't have a mechanism for doing that, can we create one?
>>>>>
>>>>>
>>>>> On 5/2/08 8:02 AM, "Ralph Castain" <rhc_at_[hidden]> wrote:
>>>>>
>>>>>> Well, I have a current version of the trunk. I add an MCA param
>>>>>> to the environment indicating that only rsh is to be used by the
>>>>>> orted. Yet I get an output from every orted indicating that slurm
>>>>>> (misspelled!) is available for selection.
>>>>>>
>>>>>> This tells me that the slurm component is being opened, even
>>>>>> though the param is set.
>>>>>>
>>>>>> I can check again to ensure that the param is set...
>>>>>>
>>>>>>
>>>>>> On 5/2/08 7:53 AM, "Jeff Squyres" <jsquyres_at_[hidden]> wrote:
>>>>>>
>>>>>>> (moving to devel list for wider audience)
>>>>>>>
>>>>>>> Hmm. I thought the UTK stuff from a while ago supposedly
>>>>>>> changed this behavior to only open the components that were
>>>>>>> specifically requested.
>>>>>>>
>>>>>>> This behavior looks like the *original* MCA behavior -- open
>>>>>>> them all, then discard what we don't want (but doesn't
>>>>>>> necessarily reclaim the memory because of how dlclose works).
>>>>>>>
>>>>>>>
>>>>>>> On May 2, 2008, at 9:48 AM, Ralph Castain wrote:
>>>>>>>
>>>>>>>> Yo guys
>>>>>>>>
>>>>>>>> I've noticed something on the trunk that just doesn't strike me
>>>>>>>> as correct. If I specify "-mca plm rsh", it is my expectation
>>>>>>>> that (a) only the rsh component will be opened, and (b) only the
>>>>>>>> rsh module will be selected, unless that component indicates that
>>>>>>>> it cannot run.
>>>>>>>>
>>>>>>>> What I am seeing, though, is that -all- the plm components are
>>>>>>>> being opened. This is not only unnecessary, but consumes memory
>>>>>>>> and leads to concern over whether or not some other module could
>>>>>>>> become active.
>>>>>>>>
>>>>>>>> Is this the intended behavior? If so, may I suggest we change it
>>>>>>>> in Josh's branch prior to bringing it over?
>>>>>>>>
>>>>>>>> Ralph
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> devel_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
Cisco Systems