Open MPI User's Mailing List Archives

From: jody (jody.xha_at_[hidden])
Date: 2007-08-14 11:22:06


Hi Tim,
thanks for the suggestions.

I have now set both paths in .zshenv, but it seems that LD_LIBRARY_PATH
still does not get set.
The ldd experiment shows that none of the Open MPI libraries are found,
and indeed printenv shows that PATH is there but LD_LIBRARY_PATH is not.

It is rather unclear why this happens...
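
One thing that could produce exactly this picture (PATH visible, LD_LIBRARY_PATH missing) is a plain assignment without `export`. PATH is normally already exported by the login environment, so reassigning it propagates to child processes, while a new variable that is only assigned stays local to the shell. This is a guess, not a confirmed diagnosis of this setup. A minimal sketch of the effect, using hypothetical variable names DEMO_PLAIN_LIB and DEMO_EXPORTED_LIB:

```shell
# Start from a clean state (both names are made up for this demo).
unset DEMO_PLAIN_LIB DEMO_EXPORTED_LIB

# Plain assignment: visible in this shell only.
DEMO_PLAIN_LIB=/opt/openmpi/lib

# Exported assignment: inherited by child processes.
# In ~/.zshenv the analogous line would be:
#   export LD_LIBRARY_PATH=/opt/openmpi/lib
export DEMO_EXPORTED_LIB=/opt/openmpi/lib

# printenv runs as a child process, so it only sees exported variables.
child_env=$(printenv | grep '^DEMO_.*_LIB=' || true)
echo "$child_env"
```

If only the exported name shows up, checking that the LD_LIBRARY_PATH line in .zshenv uses `export` would be a reasonable next step.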

As to the second problem:
$ mpirun --debug-daemons -np 2 --prefix /opt/openmpi --host nano_02 ./MPI2Test2
[aim-nano_02:05455] [0,0,1]-[0,0,0] mca_oob_tcp_peer_try_connect: connect to 130.60.49.134:40618 failed: Software caused connection abort (103)
[aim-nano_02:05455] [0,0,1]-[0,0,0] mca_oob_tcp_peer_try_connect: connect to 130.60.49.134:40618 failed, connecting over all interfaces failed!
[aim-nano_02:05455] OOB: Connection to HNP lost
[aim-plankton.unizh.ch:24222] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[aim-plankton.unizh.ch:24222] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164
[aim-plankton.unizh.ch:24222] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[aim-plankton.unizh.ch:24222] ERROR: A daemon on node nano_02 failed to start as expected.
[aim-plankton.unizh.ch:24222] ERROR: There may be more information available from
[aim-plankton.unizh.ch:24222] ERROR: the remote shell (see above).
[aim-plankton.unizh.ch:24222] ERROR: The daemon exited unexpectedly with status 1.
[aim-plankton.unizh.ch:24222] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[aim-plankton.unizh.ch:24222] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1196

The strange thing is that nano_02's address is 130.60.49.130 and plankton's
(the caller) is 130.60.49.134.
I also made sure that nano_02 can ssh to plankton without a password, but
that didn't change the output.

Does this message give any hints as to the problem?

Jody

On 8/14/07, Tim Prins <tprins_at_[hidden]> wrote:
>
> Hi Jody,
>
> jody wrote:
> > Hi
> > I installed openmpi 1.2.2 on a quad core intel machine running fedora 6
> > (hostname plankton)
> > I set PATH and LD_LIBRARY in the .zshrc file:
> Note that .zshrc is only used for interactive logins. You need to set up
> your system so that LD_LIBRARY_PATH and PATH are also set for
> non-interactive logins. See this zsh FAQ entry for what files you need
> to modify:
> http://zsh.sourceforge.net/FAQ/zshfaq03.html#l19
>
> (BTW: I do not use zsh, but my assumption is that the file you want to
> set the PATH and LD_LIBRARY_PATH in is .zshenv)
> > $ echo $PATH
> > /opt/openmpi/bin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/jody/bin
> >
> > $ echo $LD_LIBRARY_PATH
> > /opt/openmpi/lib:
> >
> > When i run
> > $ mpirun -np 2 ./MPITest2
> > i get the message
> > ./MPI2Test2: error while loading shared libraries: libmpi_cxx.so.0:
> > cannot open shared object file: No such file or directory
> > ./MPI2Test2: error while loading shared libraries: libmpi_cxx.so.0:
> > cannot open shared object file: No such file or directory
> >
> > However
> > $ mpirun -np 2 --prefix /opt/openmpi ./MPI2Test2
> > works. Any explanation?
> Yes, the LD_LIBRARY_PATH is probably not set correctly. Try running:
> mpirun -np 2 ldd ./MPITest2
>
> This should show what libraries your executable is using. Make sure all
> of the libraries are resolved.
>
> Also, try running:
> mpirun -np 1 printenv |grep LD_LIBRARY_PATH
> to see what the LD_LIBRARY_PATH is for your executables. Note that you
> can NOT simply run mpirun echo $LD_LIBRARY_PATH, as the variable will be
> interpreted in the executing shell.
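
The point about `echo $LD_LIBRARY_PATH` being expanded by the calling shell can be reproduced without mpirun. A small sketch, assuming a POSIX `sh` is available and using a made-up variable name DEMO_VAR; the child `sh` stands in for whatever process mpirun would launch:

```shell
# DEMO_VAR is hypothetical; it plays the role of LD_LIBRARY_PATH.
unset DEMO_VAR
DEMO_VAR=parent

# Double quotes: the calling shell substitutes $DEMO_VAR before the child
# starts, so the child is effectively asked to run "echo parent".
expanded_by_parent=$(DEMO_VAR=child sh -c "echo $DEMO_VAR")

# Single quotes: the text $DEMO_VAR reaches the child unexpanded, so the
# child reports the value from its own environment.
expanded_by_child=$(DEMO_VAR=child sh -c 'echo $DEMO_VAR')

echo "seen via calling shell: $expanded_by_parent"
echo "seen via child process: $expanded_by_child"
```

Running printenv under mpirun, as suggested above, sidesteps the quoting question because no shell expansion is involved at all.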
>
> >
> > Second problem:
> > I have also installed openmpi 1.2.2 on an AMD machine running gentoo
> > linux (hostname nano_02).
> > Here as well PATH and LD_LIBRARY_PATH are set correctly,
> > and
> > $ mpirun -np 2 ./MPITest2
> > works locally on nano_02.
> >
> > If, however, from plankton i call
> > $ mpirun -np 2 --prefix /opt/openmpi --host nano_02 ./MPI2Test2
> > the call hangs with no output whatsoever.
> > Any pointers on how to solve this problem?
> Try running:
> mpirun --debug-daemons -np 2 --prefix /opt/openmpi --host nano_02 ./MPI2Test2
>
> This should give some more output as to what is happening.
>
> Hope this helps,
>
> Tim
>
> >
> > Thank You
> > Jody
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>