Open MPI User's Mailing List Archives

From: Bill Johnstone (beejstone3_at_[hidden])
Date: 2007-07-17 14:15:17

Thanks for the help. I've replied below.

--- "G.O." <gurhan.ozen_at_[hidden]> wrote:

> 1- Check to make sure that there are no firewalls blocking
> traffic between the nodes.

There is no firewall in-between the nodes. If I run jobs directly via
ssh, e.g. "ssh node4 env" they work.

> 2 - Check to make sure that all nodes have the openmpi installed
> and have the very same executable you are trying to run on the same
> path, have all permissions correctly.

Yes, they are all installed to /usr/local, the permissions are the
same, and if I just invoke mpirun on an individual node by logging into
it, it works. In fact, even commands like "ssh node4 mpirun" (just to
get the mpirun help banner) work.

> 3- Check to make sure that all nodes have the same interface,
> i.e. eth0 .

They all do have the same interfaces. In my configuration, eth1 is
the interface that corresponds to the cluster IP network. I have tried
using "--mca btl_tcp_if_include eth1" but it seems to make no difference.

> That's all i can think of for very quick checks for now. Hope it's
> one of this.

Thank you very much, but unfortunately it isn't any of these, as far as
I can tell.
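
For anyone reproducing this setup, here is a minimal sketch of how an mpirun
command line with the "--mca btl_tcp_if_include eth1" restriction mentioned
above might be assembled. The helper name build_mpirun_command, the host
"node3", the process count, and the test program "hostname" are assumptions
made for illustration; only node4 and eth1 come from this thread. The sketch
only builds and prints the command. It does not run mpirun.

```python
import shlex


def build_mpirun_command(interface, hosts, program, num_procs):
    """Assemble an mpirun argument list that restricts the TCP BTL
    to a single network interface (hypothetical helper)."""
    return [
        "mpirun",
        # Limit Open MPI's TCP transport to the cluster-facing interface.
        "--mca", "btl_tcp_if_include", interface,
        # Comma-separated list of hosts to launch on.
        "--host", ",".join(hosts),
        # Number of processes to start.
        "-np", str(num_procs),
        program,
    ]


if __name__ == "__main__":
    # "node3" and "hostname" are illustrative assumptions, not from the thread.
    cmd = build_mpirun_command("eth1", ["node3", "node4"], "hostname", 2)
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(cmd))
```

Printing the command first makes it easy to confirm that the interface name
and host list are what you expect before launching a real job.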