Open MPI Development Mailing List Archives

From: Adrian Knoth (adi_at_[hidden])
Date: 2006-03-31 09:59:42


On Fri, Mar 31, 2006 at 09:07:39AM -0500, Brian Barrett wrote:

> > I have a first quick and dirty patch, replacing AF_INET by AF_INET6,
> > the sockaddr_in structs and so on.
> Is there a way to do this to better support both IPv4 and IPv6?

I think so, too. There are probably two different ways to achieve
this: either provide two components "tcp" and "tcp6" or use
v6-mapped-v4 addresses. The first would surely result in a lot
of shared code, but I think this won't be a problem. If it is
possible to have two components (and thus several modules)
for communication, this might be a solution.

The other way, v6-mapped-v4, is how normal userland daemons
are usually implemented. The application only listens on
v6 sockets; v4 addresses are mapped into ::ffff:0:0/96 as
::ffff:a.b.c.d, where a.b.c.d is the normal 32-bit v4 address:

Mar 31 13:58:26 ltw pop3-login: Login: xxxxx [::ffff:84.184.164.40]

Perhaps it's a good idea to port all internal structures to
IPv6, as it is able to represent the whole v4 address space.
One can always determine whether an address is a real v6 or
only a mapped v4 address (the common ::ffff: prefix).
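
As a rough sketch of this idea (not code from the patch), the two
helpers below store a v4 address in a v6 structure and recover it
again. The names v4_to_mapped() and mapped_to_v4() are made up for
illustration; IN6_IS_ADDR_V4MAPPED is the standard macro from
<netinet/in.h>.

```c
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: store a v4 address as ::ffff:a.b.c.d. */
static void v4_to_mapped(const struct in_addr *v4, struct in6_addr *out)
{
    memset(out, 0, sizeof *out);     /* first 80 bits are zero */
    out->s6_addr[10] = 0xff;         /* followed by 16 one bits */
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4->s_addr, 4); /* both in network order */
}

/* Hypothetical helper: recover the v4 address from a mapped one.
 * Returns 0 on success, -1 if the address is a real v6 address. */
static int mapped_to_v4(const struct in6_addr *v6, struct in_addr *out)
{
    if (!IN6_IS_ADDR_V4MAPPED(v6))
        return -1;
    memcpy(&out->s_addr, &v6->s6_addr[12], 4);
    return 0;
}
```

With something like this, internal tables could hold only
struct in6_addr and still tell the two address families apart.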

> mca_btl_tcp_proc_insert(), which is what I think you're referring to
> by the net1/net2 code, that's intended to be used to try to get all
> the multi-nic scenarios wired up in the most advantageous way
> possible. So we look at the combination IPv4 addr and netmask and
> prefer to connect two endpoints in the same subnet.

Ok, this is how I understood the code. The current implementation
does a bitwise AND on a uint32; for IPv6 the operands will be
128 bits wide.

I don't know of any predeclared type of this size, so we have
to find a different solution. Though the final decision will
always be boolean ("Are we on the same network?" Yes/No), we
have to represent the correct answer.

There is only one comparison between net1 and net2, so the
decision is a local one and we don't really need the
netmasks.
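
One possible way to get the yes/no answer without a 128-bit integer
type is to compare the addresses bytewise. This is only a sketch:
same_prefix6() is a made-up name, and it assumes a prefix length is
at hand (e.g. 64 for a typical LAN), which is not what the current
uint32 code does.

```c
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a and b agree in their first
 * prefixlen bits, 0 otherwise (or if prefixlen is out of range). */
static int same_prefix6(const struct in6_addr *a,
                        const struct in6_addr *b,
                        unsigned prefixlen)
{
    unsigned full = prefixlen / 8;   /* whole bytes covered by the prefix */
    unsigned rest = prefixlen % 8;   /* leftover bits in the next byte */

    if (prefixlen > 128)
        return 0;
    if (memcmp(a->s6_addr, b->s6_addr, full) != 0)
        return 0;
    if (rest == 0)
        return 1;

    /* Compare only the top `rest` bits of the partially covered byte. */
    unsigned char mask = (unsigned char)(0xff << (8 - rest));
    return ((a->s6_addr[full] ^ b->s6_addr[full]) & mask) == 0;
}
```

Called as same_prefix6(&net1, &net2, 64), this would answer "are we
on the same network?" for the usual /64 case.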

> I'm not sure how IPv6 deals with netmasks and routing, but I'm
> assuming there would be something similar.

Pretty much the same. Netmasks are now called "prefixlen",
integers between 0 (like /0) and 128 (like /32 in IPv4).
The typical on-link prefixlen is /64; smaller networks
(e.g. /112) are unusual, though they might exist.

Routing aggregation is done by enlarging the prefix.
A typical one is /48, which means 2^16 /64 networks with
2^64 addresses each.

In short: the LAN prefixlen will be 64 in most cases.
Larger ones (e.g. /48) are only for routing.

I apologize for calling the numerically smaller value of 48
the larger prefix compared to 64. This just refers to the
network size, as the /64 is the smaller network.

> > I don't know if this patched tcp-component can handle
> > IPv6 connections, I've never tested it. I think it
> > even breaks IPv4 functionality; we should make clear
> > how IPv4 and IPv6 may work in parallel (or may not, if
> > one considers IPv4 deprecated ;)
> From a practical standpoint, Open MPI has to support both IPv4 and
> IPv6 for the foreseeable future.

I think so, too. We're dual stacked.

> We currently try to wire up one connection per "IP device", so it
> seems like we should be able to find some way to automatically
> switch between IPv6 or IPv4 based on what we determine is available
> on that host, right?

That's right. The orte-oob seems to be the right place for
this decision, assuming that ompi/mca/btl/tcp can handle
both or that we have two different components providing the
desired functionality.

Implementing this dual-stack behaviour isn't that hard; almost
every userland tool does it this way: try v6 and, if it
fails, use v4. The user can usually force the code to use
either v4 or v6. This shouldn't be too hard in the case of
v6-mapped-v4. The only thing to take care of is RFC1918 networks.

adi_at_drcomp:~$ telnet ::ffff:127.0.0.1 25

(works fine)

To automatically select the right protocol, it might be good
to prefer IPv4 (smaller headers, hence less overhead). The user
can still force the use of IPv6 via DNS (assigning special
IPv6-only hostnames).
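
A minimal sketch of the "try one family, fall back to the other"
pattern, using getaddrinfo(). The function names try_family() and
connect_dual() are made up for this mail and are not part of Open MPI;
the order of the two families is a parameter, so either v6-first or
v4-first can be expressed.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try every resolved address of one family.
 * Returns a connected socket, or -1 if none of them worked. */
static int try_family(const struct addrinfo *list, int family)
{
    for (const struct addrinfo *ai = list; ai != NULL; ai = ai->ai_next) {
        if (ai->ai_family != family)
            continue;
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            return fd;
        close(fd);
    }
    return -1;
}

/* Hypothetical helper: connect to host:service, trying the `first`
 * address family before falling back to `second`. */
static int connect_dual(const char *host, const char *service,
                        int first, int second)
{
    struct addrinfo hints, *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* ask for both v4 and v6 results */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    int fd = try_family(res, first);
    if (fd < 0)
        fd = try_family(res, second);
    freeaddrinfo(res);
    return fd;
}
```

connect_dual(host, "25", AF_INET6, AF_INET) gives the usual userland
behaviour; swapping the last two arguments prefers IPv4 instead.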

-- 
mail: adi_at_[hidden]  	http://adi.thur.de	PGP: v2-key via keyserver
Better a peeping Tom in the garden than no electricity at all!