On Jun 13, 2007, at 1:48 PM, Gleb Natapov wrote:
>> 3. Use a file to convey this information, because it's better suited
>> to what we're trying to do (vs. MCA parameters).
>> Seriously, why is a file a bad thing? The file can list interfaces
>> by hostname. For example, if you have a heterogeneous setup, what's
>> to say that having btl_tcp_bandwidth_eth0 is not the same across all
>> your hosts? That is -- the MCA parameters you're providing are not
>> sufficient for a true heterogeneous environment, anyway.
> I don't feel strongly one way or the other. The command line approach
> was much easier to implement. Is it possible to have one parser for
> BTLs, or will each one have to implement a different one?
Let's take a step back and see exactly what we *want*. Then we can
talk about how to have an interface for it.
1. We want to be able to specify bandwidth/latency values for BTL
modules (and possibly other kinds of modules).
2. For the common case, we want to be able to specify a single [set
of] value[s] that apply uniformly across the MPI job. This already
exists in MCA parameters today.
3. For another common case, we want to be able to specify a small set
of values that apply uniformly to specific interfaces across the MPI
job (e.g., specify different values for eth0 and eth1). This exists
today in variable MCA parameters.
4. For another case (possibly uncommon?), we want to be able to
specify different values for different interfaces on different
hosts. This exists today by having different MCA parameter files on
each host and pairing it with #3. It's not exactly convenient, but
If we agree that these are the things that we want, then I think #3
is the contentious area (I don't like variable MCA params that don't
show up in ompi_info), and #4 could certainly be made more convenient
(note that I previously said #4 was not possible, but I thought about
it more and realized that it *is*; it's just not convenient as, for
example, a single file that lists all hosts and their individual
settings that can be replicated across a cluster). Indeed #3 could
be combined with a more-convenient #4 and solve all the problems.
If you can agree to that, then I propose a simple INI-style text file
that aggregates MCA parameters based on hostname. The INI section
names are hostnames, but we support simple, shell-like regular
expressions (e.g., * and ?). Consider mca-params.ini:
btl_tcp_if_include = eth1
btl_tcp_if_include = eth0,ib0
btl_tcp_bandwidth = eth0=1000,ib0=2000
btl_tcp_if_include = eth0,myri0
btl_tcp_bandwidth = eth0=1000,myri0=2000
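To make the hostname matching concrete, here is a rough Python sketch of how such a file might be resolved for one host. It is only an illustration, not how the MCA system does or would do it: the section names (node1.example.com and friends) are made up, params_for_host is a hypothetical helper, and "later matching sections override earlier ones" is an assumed precedence rule. Note also that Python's fnmatch accepts [seq] character classes in addition to * and ?, which is more than proposed above.

```python
import configparser
from fnmatch import fnmatchcase

# Hypothetical file contents; the hostnames are invented for illustration.
EXAMPLE_INI = """\
[node1.example.com]
btl_tcp_if_include = eth1

[ib-*.example.com]
btl_tcp_if_include = eth0,ib0
btl_tcp_bandwidth = eth0=1000,ib0=2000

[myri-??.example.com]
btl_tcp_if_include = eth0,myri0
btl_tcp_bandwidth = eth0=1000,myri0=2000
"""


def params_for_host(ini_text, hostname):
    """Collect the MCA parameters from every section whose name matches hostname.

    Section names are treated as shell-style patterns. Sections are applied
    in file order, so a later match overrides an earlier one (an assumption).
    """
    # Split only on '=' and disable interpolation so values such as
    # "eth0=1000,ib0=2000" are kept verbatim.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(ini_text)

    merged = {}
    for section in parser.sections():
        if fnmatchcase(hostname, section):
            merged.update(dict(parser.items(section)))
    return merged


if __name__ == "__main__":
    for host in ("node1.example.com", "ib-07.example.com", "myri-01.example.com"):
        print(host, params_for_host(EXAMPLE_INI, host))
```

The point of the sketch is that one file can be replicated unchanged across the cluster, with each host picking out only the sections that match its own name.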
More specifically, I'm proposing two things:
1. The MCA system itself accepts this INI-style file that keys off
hostnames so that this works across all of Open MPI.
2. The bandwidth/latency MCA params accept values in two forms:
- a single integer
- comma-delimited list of <interface>=<value> pairs
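For what it's worth, parsing the two value forms is not much work. A minimal Python sketch, assuming a hypothetical helper named parse_link_value (nothing like it exists in Open MPI today) that returns a job-wide default for the integer form and a per-interface table for the list form:

```python
def parse_link_value(text):
    """Parse a bandwidth/latency value in either of the two proposed forms.

    Returns (default, per_interface):
      "1000"                -> (1000, {})
      "eth0=1000,ib0=2000"  -> (None, {"eth0": 1000, "ib0": 2000})
    Raises ValueError on malformed input.
    """
    text = text.strip()

    # Form 1: a single integer that applies to every interface.
    if "=" not in text:
        return int(text), {}

    # Form 2: comma-delimited <interface>=<value> pairs.
    per_interface = {}
    for pair in text.split(","):
        name, sep, number = pair.partition("=")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"expected <interface>=<value>, got {pair!r}")
        per_interface[name] = int(number)
    return None, per_interface


if __name__ == "__main__":
    print(parse_link_value("1000"))
    print(parse_link_value("eth0=1000,ib0=2000"))
```

Keeping both forms in one parameter means the parameter name stays fixed, so it can still be listed like any other MCA parameter.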
> BTW ompi_info will not parse this file too, so it will not be able to
> present correct bandwidth/latency value just like command line
> For heterogeneous, a config file is the only option, of course.
True. But I think it's a reasonable expectation that ompi_info
should show all user-available MCA parameters. It doesn't claim to
show data files (like the HCA params file).