dig returns the following:
[terminal output start]
[bknapp@quoVadis27 ~]$ dig quoVadis27
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.2.rc1.fc15 <<>> quoVadis27
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 57978
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;quoVadis27. IN A
;; Query time: 2 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)
;; WHEN: Thu Apr 19 17:16:42 2012
;; MSG SIZE rcvd: 28
[terminal output end]
If I call "dig" on a different machine (with fedora core 12 instead
of fc 15, but the same network setup and openmpi install), then I get
the following:
[terminal output start]
[bknapp@quoVadis20 ~]$ dig quoVadis20
; <<>> DiG 9.6.1-P1-RedHat-9.6.1-11.P1.fc12 <<>> quoVadis20
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 62282
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;quoVadis20. IN A
;; Query time: 2 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)
;; WHEN: Thu Apr 19 17:58:02 2012
;; MSG SIZE rcvd: 28
[terminal output end]
In the case of quoVadis20, open-mpi runs without any problems; in the
case of quoVadis27, it comes up with the error given in the first
email...
Best,
Bernhard
-------- original message --------
Subject: Re: [OMPI users] hostname not known only in fedora 15
From: Jeffrey Squyres (jsquyres_at_[hidden])
Date: 2012-04-19 09:21:35
What happens if you "dig quoVadis27"?
If you don't get a valid answer back, then it's not a resolvable
name.
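
One caveat on the check above: dig sends its query straight to the DNS
server and does not read /etc/hosts, so an NXDOMAIN from dig only shows
that DNS does not know the name. Most programs resolve names through the
system resolver, which typically also consults /etc/hosts. As a minimal
sketch (the helper name hosts_has_name is made up for illustration), the
following checks whether a name is listed in a hosts file:

```shell
# Hypothetical helper: succeeds if NAME appears as a hostname or alias
# in the given hosts file (default: /etc/hosts). Comparison ignores case.
hosts_has_name() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }    # strip trailing comments
        {
            # field 1 is the address; the remaining fields are names
            for (i = 2; i <= NF; i++)
                if (tolower($i) == tolower(n)) found = 1
        }
        END { exit !found }   # exit status 0 only if the name was seen
    ' "$file"
}

# Example call against the system hosts file:
if hosts_has_name "$(hostname)"; then
    echo "hostname is listed in /etc/hosts"
else
    echo "hostname is NOT listed in /etc/hosts"
fi
```

With the /etc/hosts quoted below, the check for quoVadis27 would report
that the name is not listed, since only the localhost entries are
present. "getent hosts quoVadis27" is a related check that goes through
the system resolver (as configured in /etc/nsswitch.conf) rather than
DNS alone.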
On Apr 19, 2012, at 6:42 AM, Bernhard Knapp wrote:
> Dear mail-list users,
>
> I have a problem when I try to run a parallel gromacs job on fedora
> core 15. The same job (same installation options and network-setup)
> for fedora core 13 works fine. I already tried it in a fedora forum
> but I could not find a solution there ...
>
>
> [terminal output start]
>
> [name@quoVadis27 folder]$ mpirun -np 4 mdrun [...] : Could not resolve hostname quoVadis27: Name or service not known
> --------------------------------------------------------------------------
> A daemon (pid 9722) died unexpectedly with status 255 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
>
> [terminal output end]
>
>
>
> It claims that "quoVadis27" is not known; however, this is just the
> name of the machine itself:
>
> [terminal output start]
>
> [name@quoVadis27 ~]$ hostname
> quoVadis27
>
> [name@quoVadis27 ~]$ cat /etc/resolv.conf
> # Generated by NetworkManager
> nameserver 192.168.0.1
>
> [name@quoVadis27 ~]$ cat /etc/hosts
> 127.0.0.1 localhost.localdomain localhost
> ::1 localhost6.localdomain6 localhost6
>
> [terminal output end]
>
>
> Also, LD_LIBRARY_PATH is set in the bash.rc:
> export LD_LIBRARY_PATH="/usr/local/lib"
>
> Any ideas how to solve this problem?
>
> best,
> Bernhard