I forget all the formatting we are supposed to use, so I hope you'll all
just bear with me.
George brought up the fact that we used to have an MCA param to specify a
hostfile to use for a job. The hostfile behavior described on the wiki,
however, doesn't provide for that option. It associates a hostfile with a
specific app_context, and provides a detailed hierarchical layout of how
mpirun is to interpret that information.
What I propose to do is add an MCA param called "OMPI_MCA_default_hostfile"
to replace the deprecated capability. If found, the system's behavior will be:
1. in a managed environment, the default hostfile will be used to filter the
discovered nodes to define the available node pool. Any hostfile and/or dash
host options provided to an app_context will be used to further filter the
node pool to define the specific nodes for use by that app_context. Thus,
nodes in the hostfile and dash host options given to an app_context -must-
also be in the default hostfile in order to be available for use by that
app_context - any nodes in the app_context options that are not in the
default hostfile will be ignored.
2. in an unmanaged environment, the default hostfile will be used to define
the available node pool. Any hostfile and/or dash host options provided to
an app_context will be used to filter the node pool to define the specific
nodes for use by that app_context, subject to the caveat in item 1. However,
add-hostfile and add-host options will add nodes to the node pool for use
-only- by the associated app_context.
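
To make the two cases concrete, here is a rough Python sketch of the filtering
rules described above. It is only an illustration of the proposed behavior, not
the actual mpirun code: the function name resolve_nodes and its arguments are
hypothetical, node lists are passed in directly instead of being parsed from
hostfiles, and I am assuming that add-host entries are ignored in the managed
case, since the proposal only mentions them for unmanaged environments.

```python
def resolve_nodes(default_hostfile, app_hosts=None, added_hosts=None,
                  discovered=None):
    """Return the nodes usable by one app_context (illustrative only).

    default_hostfile: nodes listed in the default hostfile
    app_hosts:        nodes from the app_context's hostfile / -host options
    added_hosts:      nodes from add-hostfile / add-host options
    discovered:       nodes reported by the resource manager, or None when
                      running in an unmanaged environment
    """
    default = set(default_hostfile)

    if discovered is not None:
        # Managed: the default hostfile filters the discovered nodes.
        pool = [n for n in discovered if n in default]
    else:
        # Unmanaged: the default hostfile defines the pool.
        pool = list(default_hostfile)

    if app_hosts is not None:
        # App-level options only filter further; nodes that are not
        # already in the pool are ignored.
        wanted = set(app_hosts)
        nodes = [n for n in pool if n in wanted]
    else:
        nodes = list(pool)

    if discovered is None:
        # Assumption: add-host entries apply in the unmanaged case only,
        # and extend the list for this app_context alone.
        for n in added_hosts or []:
            if n not in nodes:
                nodes.append(n)

    return nodes


# Managed: n4 is dropped (not in default hostfile), n5 is dropped too.
managed = resolve_nodes(["n1", "n2", "n3"], app_hosts=["n2", "n4"],
                        discovered=["n1", "n2", "n5"])

# Unmanaged: n9 is dropped, n7 is added for this app_context only.
unmanaged = resolve_nodes(["n1", "n2"], app_hosts=["n2", "n9"],
                          added_hosts=["n7"])
```

With these inputs, the managed call leaves only n2, and the unmanaged call
yields n2 plus the added n7.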
I believe this proposed behavior is consistent with that described on the
wiki, and would be relatively easy to implement. If nobody objects, I will
do so by end-of-day 3/6.
Comments, suggestions, objections - all are welcome!