
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] [RFC] Default hostfile MCA param
From: Ralph H Castain (rhc_at_[hidden])
Date: 2008-03-03 12:53:41


I personally have no objection, but I would ask then that the wiki be
modified to cover this case. All I require is that someone define the syntax
to be used to indicate "this is a node I do -not- want used", or
alternatively a flag that indicates "all nodes below are -not- to be used".

Implementation isn't too hard once I have that...

On 3/3/08 9:44 AM, "Edgar Gabriel" <gabriel_at_[hidden]> wrote:

> Ralph,
>
> could this mechanism also be used to exclude a node, indicating that a job
> should never run there? Here is the problem that I face quite often: students
> working on the homework forget to allocate a partition on the cluster,
> and just type mpirun. Because of that, all jobs end up running on the
> front-end node.
>
> If we now had the ability to specify in a default hostfile that jobs should
> never run on a given node (e.g. the front-end node), users would get an
> error message when trying to do that. I am aware that that's a little
> ugly...
>
> Thanks
> edgar
>
> Ralph Castain wrote:
>> I forget all the formatting we are supposed to use, so I hope you'll all
>> just bear with me.
>>
>> George brought up the fact that we used to have an MCA param to specify a
>> hostfile to use for a job. The hostfile behavior described on the wiki,
>> however, doesn't provide for that option. It associates a hostfile with a
>> specific app_context, and provides a detailed hierarchical layout of how
>> mpirun is to interpret that information.
>>
>> What I propose to do is add an MCA param called "OMPI_MCA_default_hostfile"
>> to replace the deprecated capability. If found, the system's behavior will
>> be:
>>
>> 1. in a managed environment, the default hostfile will be used to filter the
>> discovered nodes to define the available node pool. Any hostfile and/or dash
>> host options provided to an app_context will be used to further filter the
>> node pool to define the specific nodes for use by that app_context. Thus,
>> nodes in the hostfile and dash host options given to an app_context -must-
>> also be in the default hostfile in order to be available for use by that
>> app_context - any nodes in the app_context options that are not in the
>> default hostfile will be ignored.
>>
>> 2. in an unmanaged environment, the default hostfile will be used to define
>> the available node pool. Any hostfile and/or dash host options provided to
>> an app_context will be used to filter the node pool to define the specific
>> nodes for use by that app_context, subject to the previous caveat. However,
>> add-hostfile and add-host options will add nodes to the node pool for use
>> -only- by the associated app_context.
>>
>>
>> I believe this proposed behavior is consistent with that described on the
>> wiki, and would be relatively easy to implement. If nobody objects, I will
>> do so by end-of-day 3/6.
>>
>> Comments, suggestions, objections - all are welcome!
>> Ralph
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
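
The two cases in the quoted proposal (managed and unmanaged environments) can be sketched roughly as below. This is only an illustration of the described filtering order, not Open MPI code: the function names, the argument names, and the use of plain lists of node names are all assumptions made for the example. The proposal mentions add-host/add-hostfile under the unmanaged case; the sketch does not enforce that distinction.

```python
def resolve_node_pool(default_hostfile, discovered=None):
    """Build the available node pool (hypothetical helper).

    Managed environment (discovered is given): the default hostfile
    filters the discovered nodes.
    Unmanaged environment (discovered is None): the default hostfile
    itself defines the pool.
    """
    if discovered is not None:
        allowed = set(default_hostfile)
        return [node for node in discovered if node in allowed]
    return list(default_hostfile)


def nodes_for_app_context(pool, requested=None, added=None):
    """Select the nodes one app_context may use (hypothetical helper).

    requested: nodes from that app_context's hostfile and/or -host options;
               they filter the pool, and names not in the pool are ignored.
    added:     nodes from add-hostfile/add-host; they are appended for this
               app_context only and never modify the shared pool.
    """
    if requested is None:
        nodes = list(pool)
    else:
        wanted = set(requested)
        nodes = [node for node in pool if node in wanted]
    for node in added or []:
        if node not in nodes:
            nodes.append(node)
    return nodes


# Managed: the resource manager discovered n0..n2, default hostfile lists n1..n3.
managed_pool = resolve_node_pool(["n1", "n2", "n3"], discovered=["n0", "n1", "n2"])
managed_nodes = nodes_for_app_context(managed_pool, requested=["n2", "n9"])

# Unmanaged: the default hostfile alone defines the pool.
unmanaged_pool = resolve_node_pool(["n1", "n2"])
unmanaged_nodes = nodes_for_app_context(unmanaged_pool, requested=["n1"], added=["n7"])
```

In the managed example, n9 is dropped because it is not in the pool, which mirrors the statement that nodes named by an app_context but absent from the default hostfile are ignored.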