Open MPI Development Mailing List Archives

From: Josh Hursey (jjhursey_at_[hidden])
Date: 2007-03-02 15:37:01


On Mar 2, 2007, at 12:18 PM, George Bosilca wrote:

> I think I missed the discussion about AMCA here at UTK, and about
> the benefits it gives us. Anyway, I have some comments about this
> patch.
>
> You seem to add the new AMCA files to the same string as the
> default MCA param file, and then you call your new function
> fixup_files. This function takes a directory as an argument, and you
> try to match everything against the path coming from an MCA
> parameter described as AMCA specific. That doesn't really make sense
> to me! If the prefix is AMCA specific, then don't match the MCA param
> files against it; if not, then correct the help message.

Actually, I think you have misread the code, if I understand what you
are saying correctly. We are talking about the file
"opal/mca/base/mca_base_param.c". On line 213 we call the
fixup_files() function, passing it the list of AMCA files (the -am
command line parameter) and the AMCA specific search path. This
resolves the AMCA files using that search path. The variable
new_agg_path contains the AMCA specific path, which is kept separate
from the default MCA param file variable 'new_files' until *after* we
have resolved all relative AMCA files. At that point (line 223) we
prepend the specified AMCA parameter files to the default MCA param
file list.

So it is doing exactly as it should, using the AMCA specific path to
resolve only the AMCA parameter sets, never the default MCA parameter
files.
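
To make the order of operations concrete, here is a minimal Python
sketch of the logic described above. It is not the C code from
mca_base_param.c: the name fixup_files is borrowed from the patch, but
the signatures, the build_file_list helper, and the 'exists' hook are
assumptions made purely for illustration.

```python
import os
import posixpath


def fixup_files(names, search_path, sep=":", exists=os.path.isfile):
    """Resolve relative AMCA file names against an AMCA-only search path.

    Hypothetical signature; returns (resolved_paths, names_not_found).
    """
    resolved, missing = [], []
    for name in names.split(sep):
        if not name:
            continue
        if posixpath.isabs(name):
            resolved.append(name)  # absolute paths are taken as given
            continue
        for directory in search_path.split(sep):
            candidate = posixpath.join(directory, name)
            if exists(candidate):
                resolved.append(candidate)
                break
        else:
            missing.append(name)  # no directory in the path had this file
    return resolved, missing


def build_file_list(amca_files, amca_path, default_files, sep=":",
                    exists=os.path.isfile):
    """Resolve AMCA files first, then prepend them to the default list."""
    resolved, missing = fixup_files(amca_files, amca_path, sep, exists)
    # Only the AMCA names were matched against amca_path; the default
    # MCA param files are appended untouched.
    return sep.join(resolved + default_files.split(sep)), missing
```

The point of the sketch is that the default file list never passes
through the AMCA search path; it is only joined on afterwards.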

>
> One last thing about this patch. Having the opal MCA layer export a
> bool variable just to make life easier for orted and orterun (which
> in fact don't really need it, as they set it to true multiple
> times ???) doesn't look like a good approach to me.

It is not so much making life easier for orted/orterun as it is
suppressing a warning that would otherwise be raised when the user
specifies a relative path that needs to be resolved in the current
working directory (e.g., -am ../adir/my-amac.conf). We need a way to
tell the MCA layer that it shouldn't try to resolve the AMCA files
yet, because not all of the environment variables are set up properly
at that point.

It gets a bit tricky because the orted/orterun processes have to
bootstrap the MCA layer in stages due to command line arguments. On
first entering the mca_base_param_init() function, the system has only
part of the information (mostly environment variables) and nothing
from the command line. The orted/orterun processes then parse the
command line, thereby seeding the MCA layer with the 'correct'
information. Once we have the correct information, we want to recache
the files (the mca_base_param_recache_files function) using the
user-provided information. If we had no way of telling the MCA layer
not to raise a warning about AMCA files that were not found, then for
certain relative paths the MCA layer could raise a warning on the
first pass through the library (because it doesn't have the complete
information yet) but not on the second, confusing the end user about
what happened.
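
A toy Python model of the two passes may help. Everything in it is
hypothetical (the ParamLayer class, the suppress_missing_warning flag,
the bootstrap helper, and the directory names); it only illustrates why
a flag is needed and is not how the MCA layer is actually written.

```python
class ParamLayer:
    """Toy stand-in for the MCA parameter layer (hypothetical)."""

    def __init__(self, existing_files):
        self.existing_files = set(existing_files)
        self.suppress_missing_warning = False  # stands in for the exported bool
        self.warnings = []
        self.files = []

    def recache_files(self, names, search_dirs):
        """Re-resolve the AMCA file names against the given directories."""
        self.files = []
        for name in names:
            hits = [d + "/" + name for d in search_dirs
                    if d + "/" + name in self.existing_files]
            if hits:
                self.files.append(hits[0])
            elif not self.suppress_missing_warning:
                self.warnings.append("AMCA file not found: " + name)


def bootstrap(names, partial_dirs, full_dirs, existing_files,
              suppress_first_pass=True):
    layer = ParamLayer(existing_files)
    # First pass: only part of the information is available, so a
    # relative name may legitimately fail to resolve.
    layer.suppress_missing_warning = suppress_first_pass
    layer.recache_files(names, partial_dirs)
    # Second pass: the command line has been parsed, so warn for real.
    layer.suppress_missing_warning = False
    layer.recache_files(names, full_dirs)
    return layer
```

With suppress_first_pass=False the model reports a missing file even
though the second pass resolves it, which is the misleading warning the
flag is meant to avoid.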

So this is a long way of saying this is the best way I could think of
to do this without changing a whole lot of code and more interfaces
in the MCA layer.

>
> In fact, I was wondering what is the real difference between having
> this new AMCA stuff and extending the mca_param_files default MCA
> parameter ?

Nothing much other than the way it expresses itself to the end user.
As I mentioned in my original email the default MCA parameter files
and the AMCA parameter set files are the same format, and only differ
in when they are used. The default MCA parameter files are used *all*
the time on every run. The AMCA parameter set files are only used
when the user explicitly asks for them on the command line. As you
may have noticed in the code they are parsed and processed in the
same way. But by exposing special MCA parameters for the AMCA file
sets and the AMCA special path we can logically separate them so the
default MCA parameter files are in one place and the AMCA parameter
set files are in another place on the system. This way it is a bit
clearer that the AMCA parameter sets are opt-in functionality.
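
Since both kinds of file share one format, a single parser and a
single merge step can serve both. The Python sketch below assumes
"name = value" lines with "#" comments and assumes that files earlier
in the list win, with command line -mca values applied last. The
function names and the inline-comment handling are illustrative
assumptions, not Open MPI's implementation.

```python
def parse_param_file(text):
    """Parse 'name = value' lines; '#' starts a comment (assumed format)."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def merge_params(file_texts, cli_overrides=None):
    """Merge parameter files given in priority order (highest first)."""
    merged = {}
    for text in file_texts:
        for name, value in parse_param_file(text).items():
            # setdefault keeps the value from the higher-priority file.
            merged.setdefault(name, value)
    # Values given explicitly on the command line override any file.
    merged.update(cli_overrides or {})
    return merged
```

Passing the AMCA sets ahead of the default files in file_texts mirrors
the prepend step described earlier.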

Certainly the end user could specify another file to use in addition
to the default MCA parameter files (mca_param_files), but then they
must also specify the other locations that already exist in that path
(e.g.,
$HOME/.openmpi/mca-params.conf:$SYSCONFDIR/openmpi-mca-params.conf).
This is a shortcut in a sense, so the end user doesn't have to know
all of this ugliness every time they want to run a benchmark, or ...

Hopefully that explains things a bit more, sorry if it was overly
confusing.

-- Josh

>
> Thanks,
> george.
>
> On Mar 1, 2007, at 8:52 AM, Josh Hursey wrote:
>
>> Developers,
>>
>> I just committed back to the trunk the Aggregate MCA (AMCA) Parameter
>> Set work that Jeff Squyres and I have been working on. This will be a
>> part of the eventual v1.3 release.
>>
>> The motivation for creating AMCA parameter sets came from the
>> realization that for certain applications a large number of MCA
>> parameters needed to be set for the job to run well and/or as the
>> user expects. So the goal of this work was to help reduce the number
>> of MCA parameters that the user has to manage, therefore leading to a
>> better end user experience with Open MPI.
>>
>> AMCA parameter set files are formatted exactly like the
>> "~/.openmpi/mca-params.conf" configuration files. In addition, when
>> AMCA parameter sets are used, the user may still override the
>> parameters on the command line if they like.
>>
>> For example, let's say there is a set of MCA parameters that a user
>> would need to set to get good performance out of NetPIPE when using
>> OpenIB. They would typically run the application as:
>> shell$ mpirun -np 2 NPmpi
>>
>> To use the AMCA parameter set for OpenIB the user would run:
>> shell$ mpirun -np 2 -am btl-openib-benchmark NPmpi
>>
>> This will load a series of MCA parameters for the user. If they
>> wanted to override the max_btls MCA parameter for tuning reasons they
>> would run:
>> shell$ mpirun -np 2 -am btl-openib-benchmark -mca btl_open_ib_max_btls 10 NPmpi
>>
>> AMCA parameter sets can be coupled. If we take the example above and
>> wanted to also use an AMCA parameter set for TCP, the user would run:
>> shell$ mpirun -np 2 -am btl-openib-benchmark:btl-tcp-benchmark -mca btl_open_ib_max_btls 10 NPmpi
>>
>> The AMCA parameter sets are loaded in priority order, so sets listed
>> earlier take precedence. In the example above, the OpenIB AMCA set
>> has priority over the TCP AMCA set. So if the TCP AMCA set sets the
>> MCA parameter "mpi_leave_pinned=0" and the OpenIB AMCA set sets it
>> to "mpi_leave_pinned=1", then the latter (OpenIB) value will be
>> used.
>>
>> Additional related MCA parameters:
>>   - mca_base_param_file_prefix
>>     (Default: NULL)
>>     This is the full name of the "-am" mpirun option. Used to
>>     specify a ':' separated list of AMCA parameter set files.
>>   - mca_base_param_file_path
>>     (Default: $SYSCONFDIR/amca-param-sets/:$CWD)
>>     The path to search for AMCA files with relative paths. A
>>     warning will be printed if the AMCA file cannot be found.
>>
>>
>> If you have any problems with this new feature let me know. There
>> will be an FAQ coming shortly about this I suspect.
>>
>> Cheers,
>> Josh
>>
>>
>> ----
>> Josh Hursey
>> jjhursey_at_[hidden]
>> http://www.open-mpi.org/
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> "Half of what I say is meaningless; but I say it so that the other
> half may reach you"
> Kahlil Gibran
>

----
Josh Hursey
jjhursey_at_[hidden]
http://www.open-mpi.org/