Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] hostname not known only in fedora 15
From: Bernhard Knapp (bernhard.knapp_at_[hidden])
Date: 2012-04-19 10:27:21


dig returns the following:

[terminal output start]

[bknapp_at_quoVadis27 ~]$ dig quoVadis27

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.2.rc1.fc15 <<>> quoVadis27
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 57978
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;quoVadis27. IN A

;; Query time: 2 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)
;; WHEN: Thu Apr 19 17:16:42 2012
;; MSG SIZE rcvd: 28

[terminal output end]

If I run "dig" on a different machine (Fedora Core 12 instead of FC 15, but with the same network setup and Open MPI install), I get the following:

[terminal output start]

[bknapp_at_quoVadis20 ~]$ dig quoVadis20

; <<>> DiG 9.6.1-P1-RedHat-9.6.1-11.P1.fc12 <<>> quoVadis20
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 62282
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;quoVadis20. IN A

;; Query time: 2 msec
;; SERVER: 192.168.0.1#53(192.168.0.1)
;; WHEN: Thu Apr 19 17:58:02 2012
;; MSG SIZE rcvd: 28

[terminal output end]

On quoVadis20, Open MPI runs without any problems; on quoVadis27, it fails with the error given in the first email...

Best,
Bernhard

-------- original message --------

*Subject:* Re: [OMPI users] hostname not known only in fedora 15
*From:* Jeffrey Squyres (/jsquyres_at_[hidden]/)
*Date:* 2012-04-19 09:21:35

What happens if you "dig quoVadis27"?

If you don't get a valid answer back, then it's not a resolvable name.
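
Note that dig queries the DNS server directly and does not read /etc/hosts, whereas most programs look names up through the system resolver. As a rough cross-check, the Python sketch below asks the system resolver whether the local hostname resolves. The helper name `resolve` and its `lookup` parameter are illustrative only and are not part of Open MPI.

```python
import socket


def resolve(name, lookup=socket.getaddrinfo):
    """Return the sorted unique addresses for name, or [] if it does not resolve.

    `lookup` defaults to the system resolver; it is a parameter only so that
    the function can be exercised without touching the network.
    """
    try:
        infos = lookup(name, None)
    except socket.gaierror:
        # Raised when the resolver cannot find the name at all.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    host = socket.gethostname()
    addrs = resolve(host)
    if addrs:
        print(f"{host} resolves to: {', '.join(addrs)}")
    else:
        print(f"{host} does not resolve through the system resolver")
```

If this reports that the hostname does not resolve, the name is unknown to the system resolver as well as to DNS.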

On Apr 19, 2012, at 6:42 AM, Bernhard Knapp wrote:

> Dear mail-list users,
>
> I have a problem running a parallel GROMACS job on Fedora Core 15. The same job (same installation options and network setup) works fine on Fedora Core 13. I already asked in a Fedora forum but could not find a solution there ...
>
>
> [terminal output start]
>
> [name_at_quoVadis27 folder]$ mpirun -np 4 mdrun [...] : Could not resolve hostname quoVadis27: Name or service not known
> --------------------------------------------------------------------------
> A daemon (pid 9722) died unexpectedly with status 255 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> mpirun: clean termination accomplished
>
> [terminal output end]
>
>
>
> It claims that "quoVadis27" is not known; however, this is just the name of the machine itself:
>
> [terminal output start]
>
> [name_at_quoVadis27 ~]$ hostname
> quoVadis27
>
> [name_at_quoVadis27 ~]$ cat /etc/resolv.conf
> # Generated by NetworkManager
> nameserver 192.168.0.1
>
> [name_at_quoVadis27 ~]$ cat /etc/hosts
> 127.0.0.1 localhost.localdomain localhost
> ::1 localhost6.localdomain6 localhost6
>
> [terminal output end]
>
>
> Also, LD_LIBRARY_PATH is set in the bash.rc: export LD_LIBRARY_PATH="/usr/local/lib".
>
> Any ideas how to solve this problem?
>
> best,
> Bernhard
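
The /etc/hosts listing quoted above contains only localhost names. As a diagnostic sketch (the thread itself does not confirm the cause), the following Python snippet parses hosts-file text and checks whether a given name appears in it. The function name `names_in_hosts` and the constant `HOSTS_FROM_THREAD` are made up for illustration, and names are compared case-insensitively here as an assumption of the sketch.

```python
def names_in_hosts(text):
    """Collect every hostname and alias mentioned in hosts-file text."""
    names = set()
    for line in text.splitlines():
        # Drop comments, then split into "address name [alias ...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            names.update(field.lower() for field in fields[1:])
    return names


# The two entries shown in the quoted /etc/hosts output.
HOSTS_FROM_THREAD = """\
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
"""

names = names_in_hosts(HOSTS_FROM_THREAD)
print("quoVadis27 listed:", "quovadis27" in names)
print("localhost listed:", "localhost" in names)
```

To check a live system instead, the text could be read from /etc/hosts with `open("/etc/hosts").read()`.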