Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] use additional interface for openmpi
From: worldeb_at_[hidden]
Date: 2009-09-29 09:58:42


 Hi,
 
> Open MPI should just "figure it out" and do the Right Thing at run-
> time -- is that not happening?
You are right, it should.
But I want to keep Open MPI communication separate from all other traffic, such as NFS and traffic from other jobs,
and use a dedicated Ethernet interface for it.

I have Open MPI 1.3.3 installed in the same directory on all compute nodes and on the head node.
The OS is the same across the whole cluster: Debian 5.0.
Each compute node has two interfaces: eth0 for NFS and so on,
and eth1 for Open MPI.
The head node has 5 interfaces: eth0 for NFS, eth4 for Open MPI.
The networks are as follows:
1) Head node eth0 + nodes eth0 : 192.168.0.0/24
2) Head node eth4 + nodes eth1 : 192.168.1.0/24

So how can I configure Open MPI to use only network 2) for this purpose?
That is the first question.
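
For reference, interface selection for the TCP BTL is usually done with the MCA parameters btl_tcp_if_include or btl_tcp_if_exclude (only one of the two should be set, and an exclude list should also name the loopback interface). The sketch below is a dry run that only prints a candidate command line. The helper name build_cmd is hypothetical, and the exclude list lo,eth0 is an assumption based on the layout described above. Because the MPI interface is eth1 on the nodes but eth4 on the head node, and the head node has further interfaces, a single list of names may not select exactly network 2) on every host.

```shell
#!/bin/sh
# Dry-run sketch: print an mpirun command line instead of executing it.
# build_cmd is a hypothetical helper; $1 is the comma-separated list of
# interfaces the TCP BTL should NOT use (assumed here: loopback and eth0).
build_cmd() {
    echo "mpirun --mca btl self,sm,tcp --mca btl_tcp_if_exclude $1 -np 2 -host n10,n11 cpi"
}

# Print the candidate command so it can be reviewed before running it.
build_cmd "lo,eth0"
```

If interface names alone cannot express the selection, setting a different btl_tcp_if_include value in each host's own openmpi-mca-params.conf may be an alternative, assuming the installation directory is local to each host rather than shared.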

The other problem is this:
I tried to run some examples, but unfortunately they do not work.
Maybe the network is not configured correctly.

I can only run jobs on the same host I submit them from.
When I submit from the head node to other nodes, for example, mpirun hangs without any messages,
and on the nodes where I want to run the calculation I can see that an orted daemon has been started.
(I use the default config files.)

Here are some examples:
mpirun -v --mca btl self,sm,tcp --mca btl_base_verbose 30 --mca btl_tcp_if_include eth0 -np 2 -host n10,n11 cpi
No output, no calculations, only the orted daemon on the nodes.

mpirun -v --mca btl self,sm,tcp --mca btl_base_verbose 30 -np 2 -host n10,n11 cpi
The same as above.

mpirun -v --mca btl self,sm,tcp --mca btl_base_verbose 30 -np 2 -host n00,n00 cpi
n00 is the head node; this works and produces output.

On the nodes:
route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref Use Iface
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0   0   eth0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0   0   eth1
0.0.0.0         192.168.0.100   0.0.0.0         UG    0      0   0   eth0

On the head node:
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0   0   eth0
...
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0   0   eth4
0.0.0.0         192.168.100.1   0.0.0.0         UG    0      0   0   eth1

The nodes are named n01-n99.
The head node is n00.

The hosts file looks like this and is the same on all nodes:

127.0.0.1 localhost

192.168.0.1 n01.local n01
192.168.0.2 n02.local n02
...
192.168.0.99 n99.local n99

192.168.1.1 n01e.local n01e
192.168.1.2 n02e.local n02e
...
192.168.1.99 n99e.local n99e
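
Since no DNS server is in use here, one thing that may be worth checking is whether every host name used in a job, including the head node n00, has an entry in this file on each machine. The following is a minimal sketch, assuming a POSIX shell and awk; check_names is a hypothetical helper, not part of Open MPI.

```shell
#!/bin/sh
# check_names FILE NAME...
# Print each NAME that has no entry in the hosts-format FILE.
check_names() {
    file=$1
    shift
    for name in "$@"; do
        # Skip comment lines; compare the name with every field after the address.
        if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file"; then
            echo "$name"
        fi
    done
}

# Example call (prints nothing if all names are present):
# check_names /etc/hosts n00 n10 n11
```

Any name it prints is one that the machine cannot resolve from the hosts file alone.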

/etc/host.conf:
multi on
order hosts,bind

/etc/resolv.conf:
search local
nameserver 127.0.0.1

DNS is not installed

/etc/nsswitch.conf:
...
hosts: files dns
networks: files

Thanks for your help.

> I want to use for openmpi communication the additional ethernet
> interfaces on node and head node.
> It is eth1 on the nodes and eth4 on the head node.
> So how can I configure openmpi?
>
> If I add in config file
> btl_base_include=tcp,sm,self.
> btl_tcp_if_include=eth1
>
> will it work or not?
>
> And how is it working with torque batch system (daemons listen eth0
> on all nodes).