Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r21504
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-25 16:03:04


On Jun 25, 2009, at 9:30 AM, Iain Bason wrote:

>
> On Jun 25, 2009, at 11:10 AM, Ralph Castain wrote:
>
>> They do flow along the route at all times. However, without static
>> ports the orted has to start by directly connecting to the HNP and
>> sending the orted's contact info to the HNP.
>
> This is the part I don't understand. Why can't they send the
> contact info along the route as well? Don't they have enough
> information to wire a route to the HNP? If not, can't they be given
> it at startup?

If you have a tree OOB routing topology, and you launch -all- the
daemons with a single command (which all environments except rsh do),
how does a child in the tree know the contact info of its parent? It
can't - unless you use static ports (so it can compute the IP/port),
or you do an exchange like we currently do.
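
To make the static-port case concrete, here is a minimal sketch of how a
daemon could work out its parent's contact info without any exchange. It
assumes a plain k-ary tree rooted at rank 0 (the HNP), a node list that
every daemon already has, and one agreed-upon port. The names
STATIC_PORT, parent_rank and parent_contact are made up for illustration
and are not ORTE functions, and the real routed tree may be shaped
differently.

```python
from typing import List, Optional, Tuple

# Assumption: every daemon listens on the same, pre-agreed port.
STATIC_PORT = 5000


def parent_rank(rank: int, fanout: int = 2) -> Optional[int]:
    """Parent of `rank` in a k-ary tree rooted at rank 0 (the HNP)."""
    if rank == 0:
        return None  # the root has no parent
    return (rank - 1) // fanout


def parent_contact(rank: int, hosts: List[str],
                   port: int = STATIC_PORT,
                   fanout: int = 2) -> Optional[Tuple[str, int]]:
    """Compute the parent's (host, port) purely from local knowledge."""
    parent = parent_rank(rank, fanout)
    if parent is None:
        return None
    # With a static port, knowing the parent's host is enough to connect.
    return hosts[parent], port


if __name__ == "__main__":
    hosts = ["hnp", "n1", "n2", "n3", "n4"]
    for r in range(len(hosts)):
        print(r, parent_contact(r, hosts))
```

Without the static port, the second element of that tuple is unknown
until the parent has opened its socket, which is why the exchange through
the HNP is needed.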

If you do a tree-spawn with rsh, then you obviously can, since each
parent launches its own children. We didn't do that in the original
tree spawn because we didn't have the required infrastructure at that
time to shut down a job where the orteds didn't have a direct route to
the HNP. George et al. are working on this because we -do- have the
ability to do it now.

>
>> Then the HNP includes that info in the launch msg, allowing the
>> orteds to wire up their routes.
>
>>
>> So the difference is that the static ports allow us to avoid that
>> initial HNP-direct connection, which is what causes the flood.
>
> I should warn everyone that in my experiments the HNP flood is not
> the only problem with tree spawning. In fact, it doesn't even seem
> to be the limiting problem. At the moment, it appears that the
> limiting problem on my cluster has to do with sshd/rshd accessing
> some name service (e.g., gethostbyname, getpwnam, getdefaultproject,
> or something like that).

Quite true - you have a number of things that happen on each node as
you launch down the tree. Part of the solution is to cache your DNS
lookups so they are all local. If you don't, then things run slowly -
which is what we were seeing over a year ago when playing on Ranger.
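
As a rough illustration of why caching helps, the sketch below wraps a
name lookup in a local cache so that repeated lookups of the same host
never leave the node. In practice this would be done by a node-local
caching service or hosts file rather than in application code; the
CachingResolver class is hypothetical and only shows the effect.

```python
import socket
from typing import Callable, Dict


class CachingResolver:
    """Hypothetical in-process cache in front of a name lookup."""

    def __init__(self, lookup: Callable[[str], str] = socket.gethostbyname):
        self._lookup = lookup          # the slow, possibly remote lookup
        self._cache: Dict[str, str] = {}
        self.misses = 0                # how many times we had to go remote

    def resolve(self, host: str) -> str:
        if host not in self._cache:
            self.misses += 1
            self._cache[host] = self._lookup(host)
        return self._cache[host]


if __name__ == "__main__":
    resolver = CachingResolver()
    for _ in range(3):
        resolver.resolve("localhost")
    print("remote lookups:", resolver.misses)
```

Only the first lookup of each name reaches the name service, so the load
on it no longer grows with the number of launches per node.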

>
> I am hoping to find that this is just some cluster configuration
> oddity. YMMV, of course.
>>
>> The other thing that hasn't been done yet is to have the
>> "procs-launched" messages roll up in the collective - the HNP gets
>> one per daemon right now, even though it comes down the routed path.
>> Hope to have that done next week. That will be in operation
>> regardless of static vs non-static ports.
>
> Great!
>
> Iain
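
For the "procs-launched" rollup mentioned above, a minimal sketch of the
idea: each daemon merges its own report with those of its subtree and
sends a single message to its parent, so the HNP hears only from its
direct children. This assumes a plain k-ary tree rooted at rank 0; the
function names are hypothetical and this is not the ORTE collective
code.

```python
from typing import Dict, List


def children_of(rank: int, num_daemons: int, fanout: int = 2) -> List[int]:
    """Direct children of `rank` in a k-ary tree rooted at rank 0."""
    first = rank * fanout + 1
    return [c for c in range(first, first + fanout) if c < num_daemons]


def rollup(rank: int, launched: List[int], fanout: int = 2) -> Dict[int, int]:
    """Merge this daemon's report with every report from its subtree.

    `launched[i]` is the number of procs daemon i reports as launched.
    The returned dict is the single message sent to the parent.
    """
    merged = {rank: launched[rank]}
    for child in children_of(rank, len(launched), fanout):
        merged.update(rollup(child, launched, fanout))
    return merged


def messages_at_hnp(num_daemons: int, fanout: int = 2,
                    rolled_up: bool = True) -> int:
    """Count the procs-launched messages that arrive at the HNP (rank 0)."""
    if rolled_up:
        return len(children_of(0, num_daemons, fanout))
    return num_daemons - 1  # one message per daemon, as happens today


if __name__ == "__main__":
    launched = [0, 4, 4, 4, 4, 4, 4]  # rank 0 is the HNP
    print(rollup(0, launched))
    print(messages_at_hnp(7, rolled_up=False), messages_at_hnp(7))
```

The HNP still ends up with every daemon's report, but the number of
messages it has to receive is bounded by the tree fanout rather than by
the number of daemons.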