Open MPI Development Mailing List Archives

From: Adrian Knoth (adi_at_[hidden])
Date: 2006-09-01 12:21:36


On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:

> > Do you agree to go on with two oob components, tcp and tcp6?
> Yes, I think that's the right approach

It's a deal. ;)

> I think this can be supported nicely in the framework system. All we
> have to do is set the IPv6 component's priority higher than IPv4.

Do you mean this priority?

   MCA oob: parameter "oob_tcp6_priority" (current value: "0")
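
(If so, I could raise it at runtime, e.g.

   orterun -mca oob_tcp6_priority 10 -np 2 -host hostA,hostB some_command

assuming a higher value wins the selection.)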

> We then can deal with the "try IPv6 first" by traversing the component
> list in priority order. As an example, see the RAS framework.

Where is that done? It's outside the mca/oob directory, right?
My knowledge of orte is currently more or less limited to that
subdirectory ;)
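
I imagine it roughly like this (a hypothetical sketch, not the actual
RAS code; all names are made up):

   #include <stddef.h>

   /* walk the component list, assumed sorted by descending priority,
    * and take the first one that initializes successfully */
   typedef struct component {
       struct component *next;
       const char *name;
       int priority;              /* e.g. oob_tcp6_priority */
       void *(*init)(void);
   } component_t;

   void *select_component(component_t *list)
   {
       for (component_t *c = list; NULL != c; c = c->next) {
           void *module = c->init();
           if (NULL != module) {
               return module;     /* highest-priority working component */
           }
       }
       return NULL;
   }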
 
> it. In this case, we need both OOB components active, and we need a routing
> table that tells us which one to use to talk to various processes. I suspect
> the routing table belongs in the RML framework. If you look at the PLS
> framework, you'll see where we "front" the select function to give you the
> ability to specify a preferred selection. We might have to do the same thing
> with the OOB to allow the RML to say "send this buffer using this specific
> OOB component", while still allowing it to say "send this buffer using the
> *best* component".

Sounds good (but I don't have to do it on my own, do I?).
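
Just to check that I understand the idea (completely made-up names,
only to sketch "this specific OOB component" vs. "the *best* one"):

   #include <stddef.h>
   #include <string.h>

   typedef struct oob_component {
       struct oob_component *next;
       const char *name;                    /* "tcp" or "tcp6" */
       int (*send)(const void *buf, size_t len);
   } oob_component_t;

   extern oob_component_t *oob_components;  /* all active components */
   extern oob_component_t *oob_best;        /* the selected best one */

   int rml_send(const char *component, const void *buf, size_t len)
   {
       if (NULL == component) {
           /* "send this buffer using the *best* component" */
           return oob_best->send(buf, len);
       }
       for (oob_component_t *c = oob_components; NULL != c; c = c->next) {
           if (0 == strcmp(c->name, component)) {
               /* "send this buffer using this specific OOB component" */
               return c->send(buf, len);
           }
       }
       return -1;                            /* no such component */
   }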

Right now it looks like this:

   orterun -np 2 -host hostA,hostB some_command

uses IPv4 and it is still working.

   orterun -mca oob ^tcp -np 2 -host hostA,hostB some_command

hangs. The HNP correctly generates the tcp6:// URIs, but I guess the
remote node tries to connect with its oob/tcp module (which cannot
handle IPv6 anymore).

So I chmod'ed mca_oob_tcp.so to 0 to prevent it from being loaded,
which results in a working IPv6 connection.
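
(Concretely, something like

   chmod 0 $PREFIX/lib/openmpi/mca_oob_tcp.so

where $PREFIX is the installation prefix; the exact component path
may differ on other installs.)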

(For now I don't know why the hang happens, but at least the
oob/tcp6 component is working at all.)

> I suspect that backend processes (i.e., non-HNP processes) really will
> only use one or the other.

The question also arises for the btl/tcp component: if all nodes
are to be able to communicate with each other, they must all use
the same address family.
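
(To illustrate what I mean; plain sockets, nothing Open MPI specific:)

   #include <sys/socket.h>

   /* a peer address is only usable if both endpoints speak the same
    * address family; an AF_INET socket cannot connect to an
    * AF_INET6-only peer and vice versa */
   static int family_matches(const struct sockaddr *local,
                             const struct sockaddr *peer)
   {
       return local->sa_family == peer->sa_family;
   }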

Thanks for your help.

-- 
mail: adi_at_[hidden]  	http://adi.thur.de	PGP: v2-key via keyserver
Person1: Great. Tomorrow at 9 I have to give a presentation. YUCK!
Person2: Tomorrow at 9 I'll be holding a coffee cup.