On Tue, Mar 23, 2010 at 10:25 AM, Nicolas Niclausse
> I'm trying to run Open MPI (1.4.1) on two clusters; on each cluster, several
> interfaces are private;
> on cluster1, nodes have 3 interfaces, and only 192.168.159.0/24 is visible
> from cluster2.
> eth0 inet addr:192.168.160.76 Bcast:192.168.160.255 Mask:255.255.255.0
> eth1 inet addr:192.168.159.76 Bcast:192.168.159.255 Mask:255.255.255.0
> myri0 inet addr:192.168.162.76 Bcast:192.168.162.255 Mask:255.255.255.0
> on cluster2, nodes have 3 interfaces, and only 172.24.110.0/17 is visible
> from cluster1
> eth0 inet addr:172.24.190.8 Bcast:172.24.191.255 Mask:255.255.192.0
> eth1 inet addr:172.24.110.8 Bcast:172.24.127.255 Mask:255.255.128.0
> eth2 inet addr:172.24.240.8 Bcast:172.24.255.255 Mask:255.255.192.0
> So I'm using this to declare all the other networks as private:
> mpirun -machinefile ~/gridnodes --mca opal_net_private_ipv4
> but this doesn't work:
Have you tried -mca btl_tcp_if_include/exclude?
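For example (a sketch only -- interface names taken from the ifconfig
output above, and the application name is a placeholder), restricting the
TCP BTL on cluster1 to the interface that cluster2 can reach would look
something like:

```shell
# Only use eth1 (192.168.159.0/24), the interface visible from cluster2.
# btl_tcp_if_include and btl_tcp_if_exclude are mutually exclusive; if
# you use the exclude form instead, remember to keep lo excluded too:
#   --mca btl_tcp_if_exclude lo,eth0,myri0
mpirun -machinefile ~/gridnodes \
       --mca btl_tcp_if_include eth1 \
       ./my_mpi_app
```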
> Why does Open MPI try to connect over the different private networks, given
> that "public" networks exist? Is it a bug or am I missing something?
From what I've seen, I believe Open MPI tries to find the fastest route
to the nodes. In some cases it's trivial to sort that out; in other
cases you might need to give it some hints.
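As one possible hint (a sketch, not verified on this setup): since only one
interface per side is routable from the other cluster, you could pin both
MPI traffic and the runtime's out-of-band channel to it. On cluster2 that
interface also happens to be eth1 (172.24.110.0/17):

```shell
# Pin the TCP BTL and the out-of-band channel (oob_tcp_if_include --
# assuming that parameter is available in your 1.4.x build; check with
# "ompi_info --param oob tcp") to the cross-cluster interface.
mpirun -machinefile ~/gridnodes \
       --mca btl_tcp_if_include eth1 \
       --mca oob_tcp_if_include eth1 \
       ./my_mpi_app
```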