It's been a while, but I vaguely remember the discussion. IIRC, the rationale was that the default hostfile was equivalent to an RM allocation and should be treated the same. So hostfile and -host become filters in that case.
FWIW, I believe the discussion was split on that question. I added a "none" option to the default hostfile MCA param so it would be ignored in the case where (a) the sys admin has given a default hostfile, but (b) someone wants to use hosts outside of it.
MCA orte: parameter "orte_default_hostfile" (current value: <none>, data source: default value)
Name of the default hostfile (relative or absolute path, "none" to ignore environmental or default MCA param setting)
That said, I can see a use-case argument for behaving somewhat differently. We've even had cases where users have gotten an allocation from an RM, but want to add hosts that are external to the cluster to the job.
It would be rather trivial to modify the logic:
1. read the default hostfile or RM allocation for our baseline
2. remove any hosts on that list that are *not* in the given hostfile
3. add any hosts that are in the given hostfile, but weren't in the default hostfile
And subsequently do the same for -host. I think that would retain the spirit of the discussion while providing more flexibility and a tad more "expected" behavior.
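To make the three steps concrete, here is a minimal sketch in Python rather than the actual ORTE C code. The function name `merge_hosts` and the host names are made up for illustration, and the sketch ignores slot counts and anything else a real hostfile entry might carry.

```python
def merge_hosts(baseline, requested):
    """Apply steps 2 and 3 to a baseline host list (hypothetical helper)."""
    requested_set = set(requested)

    # Step 2: drop baseline hosts that are not in the requested list.
    kept = [host for host in baseline if host in requested_set]

    # Step 3: append requested hosts that were not in the baseline,
    # preserving their order and skipping duplicates.
    seen = set(kept)
    added = []
    for host in requested:
        if host not in seen:
            seen.add(host)
            added.append(host)

    return kept + added


# Step 1: the baseline comes from the default hostfile or the RM allocation.
baseline = ["node1", "node2", "node3"]

# Apply the given hostfile first, then -host on top of that result.
after_hostfile = merge_hosts(baseline, ["node2", "node3", "ext1"])
after_host = merge_hosts(after_hostfile, ["node3", "ext1"])
```

In this sketch the final set of hosts is whatever the given hostfile (and then -host) names; the baseline only influences the ordering. Whether the baseline should contribute more than that is something a real implementation would have to decide.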
I don't have an iron in this fire as I don't use hostfiles, so I'm happy to implement whatever the community would like to see.
On Jul 27, 2012, at 6:30 PM, George Bosilca wrote:
> I'm somewhat puzzled by the behavior of the -hostfile option in Open MPI. Based on the FAQ, it is supposed to provide a list of resources to be used by the launcher (in my case ssh) to start the processes. Makes sense so far.
> However, if the configuration file contains a value for orte_default_hostfile, then the behavior of the hostfile option changes drastically, and the option becomes a filter (the machines must be on the original list or a cryptic error message is displayed).
> Overall, we have a well-defined, [mostly] consistent behavior for parameters in Open MPI. We have a clearly defined order of precedence for the sources of MCA parameters, which makes understanding where a value comes from straightforward. I'm absolutely certain there was a group discussion about this unique "eccentricity" regarding the hostfile option, but I fail to remember the reason we decided to go this way. Can I have a quick refresher please?