
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Change in hostfile behavior
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-07-29 08:43:40


Lenny's point is true - except for the danger of setting that mca
param and its possible impact on the ORTE daemons and mpirun (see my
other note in that regard). However, it would only help if the same
user were launching both jobs.

I believe Tim was concerned about the case where two users are sharing
nodes. There is no good solution for that case: two mpiruns launched
by different users that share a node, with no knowledge of each
other's actions, will collide.

We should probably warn about this in our FAQ or somewhere similar,
since it is a fairly common use-case. The only mitigation I can think
of is to recommend that people default to running without affinity and
only enable it when they -know- they have sole use of their nodes.
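
For example - a sketch only, assuming the mpi_paffinity_alone MCA
param of this era, with a hypothetical process count and app name:

   # default: no affinity - safe even when nodes may be shared
   mpirun -np 4 ./app

   # only when you -know- you have the nodes to yourself
   mpirun -np 4 -mca mpi_paffinity_alone 1 ./app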

On Jul 29, 2008, at 12:17 AM, Lenny Verkhovsky wrote:

> For two separate runs, we can use the slot_list parameter
> (opal_paffinity_base_slot_list) to get paffinity:
>
> 1: mpirun -mca opal_paffinity_base_slot_list "0-1"
>
> 2: mpirun -mca opal_paffinity_base_slot_list "2-3"
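>
> A fuller sketch of the same idea (process counts and app names
> hypothetical):
>
>   mpirun -np 2 -mca opal_paffinity_base_slot_list "0-1" ./app_a
>   mpirun -np 2 -mca opal_paffinity_base_slot_list "2-3" ./app_b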
>
>
> On 7/28/08, Ralph Castain <rhc_at_[hidden]> wrote:
> Actually, this is true today regardless of this change. If two
> separate mpirun invocations share a node and attempt to use
> paffinity, they will conflict with each other. The problem isn't
> caused by the hostfile sub-allocation. The problem is that the two
> mpiruns have no knowledge of each other's actions, and hence assign
> node ranks to each process independently.
>
> Thus, we would have two procs that each think they are node rank=0
> and should therefore bind to processor 0, and so on up the line.
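>
> To illustrate (hypothetical: a 4-core node running two 2-proc jobs,
> both with paffinity enabled):
>
>   mpirun #1: node ranks 0,1 -> bind to cores 0,1
>   mpirun #2: node ranks 0,1 -> bind to cores 0,1 (collides with #1)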
>
> Obviously, if you run within one mpirun and have two app_contexts,
> the hostfile sub-allocation is fine - mpirun will track node rank
> across the app_contexts. It is only the use of multiple mpiruns that
> share nodes that causes the problem.
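>
> For instance, something like the following (app names hypothetical)
> is safe, because the single mpirun assigns node ranks across both
> app_contexts:
>
>   mpirun --hostfile hosts_a -np 2 ./app_a : \
>          --hostfile hosts_b -np 2 ./app_b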
>
> Several of us have discussed this problem and have a proposed
> solution for 1.4. Once we get past 1.3 (someday!), we'll bring it to
> the group.
>
>
>
> On Jul 28, 2008, at 10:44 AM, Tim Mattox wrote:
>
> My only concern is how this will interact with PLPA.
> Say two Open MPI jobs each use "half" the cores (slots) on a
> particular node... how would they be able to bind themselves to
> disjoint sets of cores? I'm not asking you to solve this, Ralph - I'm
> just pointing it out so we can perhaps warn users that if both jobs
> sharing a node try to use processor affinity, we don't make that
> magically work well; in fact, we would expect it to perform quite
> poorly.
>
> I could see disabling paffinity, and/or warning if it were enabled,
> for one of these "fractional" nodes.
>
> On Mon, Jul 28, 2008 at 11:43 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> Per an earlier telecon, I have modified the hostfile behavior
> slightly to
> allow hostfiles to subdivide allocations.
>
> Briefly: given an allocation, we allow users to specify --hostfile
> on a
> per-app_context basis. In this mode, the hostfile info is used to
> filter the
> nodes that will be used for that app_context. However, the prior
> implementation only filtered the nodes themselves - i.e., it was a
> binary
> filter that allowed you to include or exclude an entire node.
>
> The change now allows you to include a specified #slots for a given
> node, as opposed to -all- slots from that node. You are limited to
> the #slots included in the original allocation. I just realized that
> I wasn't outputting a warning if you attempt to violate this
> condition - I will do so shortly. Rather than just aborting when this
> happens, I reset the count to that of the original allocation -
> please let me know if you would prefer an abort.
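>
> A sketch of the intended usage (hostname, slot counts, and app name
> all hypothetical): if the original allocation gave you 4 slots on
> nodeA, a per-app_context hostfile can now claim just 2 of them:
>
>   # hosts_a
>   nodeA slots=2
>
>   mpirun --hostfile hosts_a -np 2 ./app_a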
>
> If you have interest in this behavior, please check it out and let
> me know if it meets your needs.
>
> Ralph
>
> --
> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
> tmattox_at_[hidden] || timattox_at_[hidden]
> I'm a bright... http://www.the-brights.net/
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel