Open MPI Development Mailing List Archives

From: Adrian Knoth (adi_at_[hidden])
Date: 2006-09-01 12:21:36


On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:

> > Do you agree to go on with two oob components, tcp and tcp6?
> Yes, I think that's the right approach

It's a deal. ;)

> I think this can be supported nicely in the framework system. All we
> have to do is set the IPv6 component's priority higher than IPv4.

Do you mean this priority?

   MCA oob: parameter "oob_tcp6_priority" (current value: "0")

> We then can deal with the "try IPv6 first" by traversing the component
> list in priority order. As an example, see the RAS framework.

Where is it done? It's outside the mca/oob directory, right?
My knowledge about orte is currently more or less limited to
this subdirectory ;)
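
A rough sketch of what "traversing the component list in priority order" could look like. This is Python for brevity, not actual ORTE code; the Component type, its init callback and the priority values are made up for illustration.

```python
from typing import Callable, List, NamedTuple, Optional


class Component(NamedTuple):
    name: str                  # e.g. "tcp6" (illustrative)
    priority: int              # higher value is tried first
    init: Callable[[], bool]   # returns True if the component is usable


def select_component(components: List[Component]) -> Optional[Component]:
    """Return the first usable component in descending priority order."""
    for comp in sorted(components, key=lambda c: c.priority, reverse=True):
        if comp.init():
            return comp
    return None


components = [
    Component("tcp", 0, lambda: True),
    Component("tcp6", 10, lambda: True),
]
chosen = select_component(components)
print(chosen.name if chosen else "none")
```

With both components usable, tcp6 is chosen because of its higher priority; if its init callback returned False, the loop would fall back to tcp.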
 
> it. In this case, we need both OOB components active, and we need a routing
> table that tells us which one to use to talk to various processes. I suspect
> the routing table belongs in the RML framework. If you look at the PLS
> framework, you'll see where we "front" the select function to give you the
> ability to specify a preferred selection. We might have to do the same thing
> with the OOB to allow the RML to say "send this buffer using this specific
> OOB component", while still allowing it to say "send this buffer using the
> *best* component".

Sounds good (but I don't have to do it on my own, do I?).
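
To check that I understand the routing table idea, here is a minimal sketch of "use this specific component" versus "use the best component". It is Python, not RML code; the ROUTES table, the pick_oob function and the host entries are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical routing table: peer name -> OOB components that can reach it,
# listed best first.
ROUTES: Dict[str, List[str]] = {
    "hostA": ["tcp6", "tcp"],
    "hostB": ["tcp"],
}


def pick_oob(peer: str, preferred: Optional[str] = None) -> str:
    """Pick the OOB component used to talk to a peer.

    With `preferred`, the caller insists on one specific component.
    Without it, the best (first listed) component for that peer is used.
    """
    usable = ROUTES.get(peer, [])
    if not usable:
        raise LookupError(f"no route to {peer}")
    if preferred is None:
        return usable[0]
    if preferred not in usable:
        raise LookupError(f"{preferred} cannot reach {peer}")
    return preferred


print(pick_oob("hostA"))         # best component for hostA
print(pick_oob("hostA", "tcp"))  # caller forces a specific component
```

Asking for a component that cannot reach the peer raises an error here instead of silently picking another one; whether the real framework should fall back in that case is an open design question.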

Right now it looks like this:

   orterun -np 2 -host hostA,hostB some_command

uses IPv4 and still works.

   orterun -mca oob ^tcp -np 2 -host hostA,hostB some_command

hangs. The HNP correctly generates the tcp6:// URIs, but I guess
the remote node tries to connect with its oob/tcp module (which
can no longer handle IPv6).

So I ran chmod 0 on mca_oob_tcp.so to prevent it from being loaded,
which resulted in a working IPv6 connection.

(For now, I don't know why the hang happens, but at least
 the oob/tcp6 component works.)
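
If my guess is right, choosing the component from the scheme of the contact URI would turn such a mismatch into an explicit error rather than a hang. A minimal sketch of that idea in Python: only the tcp6:// prefix comes from the real URIs; the rest of the URI layout, the HANDLERS mapping and the component_for function are assumptions for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from contact-URI scheme to the component handling it.
HANDLERS = {"tcp": "oob/tcp", "tcp6": "oob/tcp6"}


def component_for(uri: str, loaded: set) -> str:
    """Return the loaded component matching the URI scheme, or raise."""
    scheme = urlsplit(uri).scheme
    component = HANDLERS.get(scheme)
    if component is None or component not in loaded:
        raise LookupError(f"no loaded component handles {scheme}:// URIs")
    return component


# The address and port below are placeholders from the documentation range.
print(component_for("tcp6://[2001:db8::1]:5000", {"oob/tcp", "oob/tcp6"}))
```

A node that has only oob/tcp loaded would get a LookupError for a tcp6:// URI instead of trying to connect with the wrong module.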

> I suspect that backend processes (i.e., non-HNP processes) really will
> only use one or the other.

The question also arises for the btl/tcp component: if all nodes
should be able to communicate with each other, they must use the
same address family.
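
As a sketch of the constraint: given the addresses each node has, find an IP version that all of them share. This is Python using the standard ipaddress module, not btl/tcp code; the common_ip_version function, the preference for IPv6 and the example addresses are assumptions.

```python
import ipaddress
from typing import Dict, List, Optional


def common_ip_version(nodes: Dict[str, List[str]]) -> Optional[int]:
    """Return an IP version every node has (6 preferred, else 4), or None."""
    if not nodes:
        return None
    shared = {4, 6}
    for addresses in nodes.values():
        # Keep only the versions this node also has an address for.
        shared &= {ipaddress.ip_address(a).version for a in addresses}
    if 6 in shared:
        return 6
    if 4 in shared:
        return 4
    return None


nodes = {
    "hostA": ["192.0.2.10", "2001:db8::a"],
    "hostB": ["192.0.2.11"],
}
print(common_ip_version(nodes))
```

Here hostB has no IPv6 address, so the only version both nodes can use is IPv4, even though hostA is dual-stacked.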

Thanks for your help.

-- 
mail: adi_at_[hidden]  	http://adi.thur.de	PGP: v2-key via keyserver
Person1: Great. I have to give a presentation tomorrow at 9. Ugh!
Person2: Tomorrow at 9 I'll be holding a coffee cup.