Raymond Wan wrote:
>
> Hi Jeff,
>
> Some "good" news (but still some bad news). Y and Z are part of a set
> of 8 machines and I found out that mpirun works for one of them. I
> didn't check a couple of them before -- sorry! However, I'm no closer
> to the solution since all 8 should be "identical", according to our
> sysadmin. He said the only difference (that he can think of) between
> the working one and all the others is that the working one has an NIS
> server installed. It is the NIS server for the cluster (presumably, the
> others run a client version). Could that be the reason? He can't think
> of anything else that distinguishes between them but he says it is
> possible that the NIS server is correctly configured for what we use it
> for, but not for what I'm doing with Open MPI -- he doesn't know what
> should be done, though.
In an earlier e-mail in this thread, I theorized that this might be a
problem with your name service. This latest information seems to support
that theory.
To test, use the 'host' command on each of the 3 systems to see if it
can resolve the hostnames of all 3 systems.
On host X, do this:
    host X
    host Y
    host Z
Then do the same on hosts Y and Z.
If the 'host' command can resolve a hostname properly, you should see
something like this:
    $ host foo
    foo.example.com has address 192.168.1.1
If 'host' can't resolve a hostname properly, you should see something
like this:
    $ host bar
    Host bar not found: 3(NXDOMAIN)
Open MPI should be using the same name service libraries all the other
programs use, so I find it hard to believe everything *but* Open MPI is
working properly, but I suppose it could be possible. I've seen weirder.
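If typing the commands by hand on every machine gets tedious, the check can be scripted. The following is only a rough sketch, not something taken from Open MPI: the function name check_hosts is made up for illustration, and it assumes a system where 'getent' is available (typically Linux with glibc). It uses 'getent hosts' rather than 'host' because 'host' queries DNS directly, whereas 'getent' goes through the system's name service switch (files, NIS, DNS, as configured), which is closer to what ordinary programs see.

```shell
#!/bin/sh
# check_hosts (hypothetical helper): try to resolve each hostname given
# as an argument. Prints one OK/FAIL line per name and returns the
# number of names that could not be resolved.
check_hosts() {
    failures=0
    for name in "$@"; do
        # getent consults the name service switch, not just DNS.
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            failures=$((failures + 1))
        fi
    done
    return "$failures"
}
```

Run it on each machine as "check_hosts X Y Z", with X, Y and Z replaced by the real hostnames. A FAIL line on one machine but not on another would point at a name service difference between them.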
--
Prentice