Open MPI User's Mailing List Archives

From: Pak Lui (Pak.Lui_at_[hidden])
Date: 2006-06-16 11:44:06


Hi Eric,

I am starting to see what you are saying. You are pointing out that your
libdir is set to lib64 instead of just lib, and somehow it doesn't get
picked up.
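
To make that concrete: the "reset PATH" and "reset LD_LIBRARY_PATH" lines
in the mpirun -d trace quoted below suggest that --prefix is expanded by
appending bin and lib to whatever is passed. Here is a minimal sketch of
that reading. The two helper functions are hypothetical names made up for
illustration and are not Open MPI code; only MPI_BASE comes from the
quoted session.

```shell
#!/bin/bash
# Hypothetical helpers that mimic what the -d trace appears to show;
# this is an illustration, not how Open MPI is actually implemented.

# Strip one trailing slash from the prefix, then append the subdirectory.
prefix_bin_dir() { printf '%s/bin\n' "${1%/}"; }
prefix_lib_dir() { printf '%s/lib\n' "${1%/}"; }

MPI_BASE='/usr/lib64/openmpi/1.0.2-gcc-4.1'

# Try both prefixes used in the quoted runs.
for prefix in "$MPI_BASE/" "$MPI_BASE/lib64/"; do
    echo "prefix:                 $prefix"
    echo "  added to PATH:        $(prefix_bin_dir "$prefix")"
    echo "  LD_LIBRARY_PATH set:  $(prefix_lib_dir "$prefix")"
done
```

If that reading is right, neither prefix yields an LD_LIBRARY_PATH of
${MPI_BASE}/lib64, which is where the --libdir option in the quoted
configure line puts the libraries.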

I have not tried this option myself, so I don't think I can help you
much here. But I saw that there are changes in the rsh pls module for
the trunk and 1.1 versions (r9930, r9931, r10207, r10214) that may solve
your lib64 issue. If you run ldd on a.out, it will show the libraries it
is linked to. Other than that, setting LD_LIBRARY_PATH64 shouldn't make
a difference either.
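
As a rough sketch of the ldd check, assuming the libraries really are
under ${MPI_BASE}/lib64 as the quoted configure line indicates: note that
LD_LIBRARY_PATH is exported here, whereas the quoted opmpi64.sh assigns it
without export, so unless the variable was already exported in that shell,
mpirun and a.out would not inherit it. The helper name
has_unresolved_libs is made up for this example.

```shell
#!/bin/bash
# Sketch only: MPI_BASE and ./a.out are taken from the quoted session;
# has_unresolved_libs is a hypothetical helper, not an existing tool.

MPI_BASE='/usr/lib64/openmpi/1.0.2-gcc-4.1'
export PATH="$PATH:${MPI_BASE}/bin"
# Export so child processes (mpirun, a.out) see the library directory;
# keep any previous value after the new entry.
export LD_LIBRARY_PATH="${MPI_BASE}/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Reads ldd output on stdin; succeeds if any library is "not found".
has_unresolved_libs() { grep -q 'not found'; }

if [ -x ./a.out ]; then
    if ldd ./a.out | has_unresolved_libs; then
        echo "some shared libraries are still unresolved:"
        ldd ./a.out | grep 'not found'
    else
        echo "all shared libraries resolved"
    fi
fi
```

If libmpi.so.0 still shows up as not found after this, the library
directory is probably somewhere other than ${MPI_BASE}/lib64.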

I am not sure if others can help you on this.

Eric Thibodeau wrote:
> Hello,
>
> I don't want to get too much off topic in this reply but you're bringing
> out a point here. I am unable to run mpi apps on the AMD64 platform with
> the regular exporting of $LD_LIBRARY_PATH and $PATH, this is why I have
> no choice but to revert to using the --prefix approach. Here are a few
> execution examples to demonstrate my point:
>
> kyron_at_headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun --prefix
> /usr/lib64/openmpi/1.0.2-gcc-4.1/ -np 2 ./a.out
>
> ./a.out: error while loading shared libraries: libmpi.so.0: cannot open
> shared object file: No such file or directory
>
> kyron_at_headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun --prefix
> /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 2 ./a.out
>
> [headless:10827] pls:rsh: execv failed with errno=2
>
> [headless:10827] ERROR: A daemon on node localhost failed to start as
> expected.
>
> [headless:10827] ERROR: There may be more information available from
>
> [headless:10827] ERROR: the remote shell (see above).
>
> [headless:10827] ERROR: The daemon exited unexpectedly with status 255.
>
> kyron_at_headless ~ $ cat opmpi64.sh
>
> #!/bin/bash
>
> MPI_BASE='/usr/lib64/openmpi/1.0.2-gcc-4.1'
>
> export PATH=$PATH:${MPI_BASE}/bin
>
> LD_LIBRARY_PATH=${MPI_BASE}/lib64
>
> kyron_at_headless ~ $ . opmpi64.sh
>
> kyron_at_headless ~ $ mpirun -np 2 ./a.out
>
> ./a.out: error while loading shared libraries: libmpi.so.0: cannot open
> shared object file: No such file or directory
>
> kyron_at_headless ~ $
>
> Eric
>
> On Friday 16 June 2006 10:31, Pak Lui wrote:
>
> > Hi, I noticed your prefix set to the lib dir, can you try without the
>
> > lib64 part and rerun?
>
> >
>
> > Eric Thibodeau wrote:
>
> > > Hello everyone,
>
> > >
>
> > > Well, first off, I hope this problem I am reporting is of some
> validity,
>
> > > I tried finding similar situations off Google and the mailing list
> but
>
> > > came up with only one reference [1] which seems invalid in my case
> since
>
> > > all executions are local (naïve assumptions that it makes a difference
>
> > > on the calling stack). I am trying to run a simple HelloWorld using
>
> > > OpenMPI 1.0.2 on an AMD64 machine and a Sun Enterprise (12 procs)
>
> > > machine. In both cases I get the following error:
>
> > >
>
> > > pls:rsh: execv failed with errno=2
>
> > >
>
> > > Here is the mpirun -d trace when running my HelloWorld (on AMD64):
>
> > >
>
> > > kyron_at_headless ~ $ mpirun -d --prefix
>
> > > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 4 ./hello
>
> > >
>
> > > [headless:10461] procdir: (null)
>
> > >
>
> > > [headless:10461] jobdir: (null)
>
> > >
>
> > > [headless:10461] unidir:
>
> > > /tmp/openmpi-sessions-kyron_at_headless_0/default-universe
>
>> >
>
>> > [headless:10461] top: openmpi-sessions-kyron_at_headless_0
>
>> >
>
>> > [headless:10461] tmp: /tmp
>
>> >
>
>> > [headless:10461] [0,0,0] setting up session dir with
>
>> >
>
>> > [headless:10461] tmpdir /tmp
>
>> >
>
>> > [headless:10461] universe default-universe-10461
>
>> >
>
>> > [headless:10461] user kyron
>
>> >
>
>> > [headless:10461] host headless
>
>> >
>
>> > [headless:10461] jobid 0
>
>> >
>
>> > [headless:10461] procid 0
>
>> >
>
>> > [headless:10461] procdir:
>
>> > /tmp/openmpi-sessions-kyron_at_headless_0/default-universe-10461/0/0
>
>> >
>
>> > [headless:10461] jobdir:
>
>> > /tmp/openmpi-sessions-kyron_at_headless_0/default-universe-10461/0
>
>> >
>
>> > [headless:10461] unidir:
>
>> > /tmp/openmpi-sessions-kyron_at_headless_0/default-universe-10461
>
>> >
>
>> > [headless:10461] top: openmpi-sessions-kyron_at_headless_0
>
>> >
>
>> > [headless:10461] tmp: /tmp
>
>> >
>
>> > [headless:10461] [0,0,0] contact_file
>
>> >
> /tmp/openmpi-sessions-kyron_at_headless_0/default-universe-10461/universe-setup.txt
>
>> >
>
>> > [headless:10461] [0,0,0] wrote setup file
>
>> >
>
>> > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x1)
>
>> >
>
>> > [headless:10461] pls:rsh: local csh: 0, local bash: 1
>
>> >
>
>> > [headless:10461] pls:rsh: assuming same remote shell as local shell
>
>> >
>
>> > [headless:10461] pls:rsh: remote csh: 0, remote bash: 1
>
>> >
>
>> > [headless:10461] pls:rsh: final template argv:
>
>> >
>
>> > [headless:10461] pls:rsh: /usr/bin/ssh <template> orted --debug
>
>> > --bootproxy 1 --name <template> --num_procs 2 --vpid_start 0 --nodename
>
>> > <template> --universe kyron_at_headless:default-universe-10461 --nsreplica
>
>> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
> --gprreplica
>
>> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
>
>> > --mpi-call-yield 0
>
>> >
>
>> > [headless:10461] pls:rsh: launching on node localhost
>
>> >
>
>> > [headless:10461] pls:rsh: oversubscribed -- setting mpi_yield_when_idle
>
>> > to 1 (1 4)
>
>> >
>
>> > [headless:10461] pls:rsh: localhost is a LOCAL node
>
>> >
>
>> > [headless:10461] pls:rsh: reset PATH:
>
>> >
> /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/bin:/usr/local/bin:/usr/bin:/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/c3-4:/usr/qt/3/bin:/usr/lib64/openmpi/1.0.2-gcc-4.1/bin
>
>> >
>
>> > [headless:10461] pls:rsh: reset LD_LIBRARY_PATH:
>
>> > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/lib
>
>> >
>
>> > [headless:10461] pls:rsh: changing to directory /home/kyron
>
>> >
>
>> > [headless:10461] pls:rsh: executing: orted --debug --bootproxy 1 --name
>
>> > 0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost --universe
>
>> > kyron_at_headless:default-universe-10461 --nsreplica
>
>> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
> --gprreplica
>
>> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"
>
>> > --mpi-call-yield 1
>
>> >
>
>> > [headless:10461] pls:rsh: execv failed with errno=2
>
>> >
>
>> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
>
>> >
>
>> > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0xa)
>
>> >
>
>> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
>
>> >
>
>> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
>
>> >
>
>> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving
>
>> >
>
>> > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x9)
>
>> >
>
>> > [headless:10461] ERROR: A daemon on node localhost failed to start as
>
>> > expected.
>
>> >
>
>> > [headless:10461] ERROR: There may be more information available from
>
>> >
>
>> > [headless:10461] ERROR: the remote shell (see above).
>
>> >
>
>> > [headless:10461] ERROR: The daemon exited unexpectedly with status 255.
>
>> >
>
>> > [headless:10461] sess_dir_finalize: found proc session dir empty -
> deleting
>
>> >
>
>> > [headless:10461] sess_dir_finalize: found job session dir empty -
> deleting
>
>> >
>
>> > [headless:10461] sess_dir_finalize: found univ session dir empty -
> deleting
>
>> >
>
>> > [headless:10461] sess_dir_finalize: top session dir not empty - leaving
>
>> >
>
>> > The two platforms are very different, one is AMD64 (dual Opteron) with
>
>> > GCC 4.1.1 (Gentoo), the other is SUN OS 5.8 with GCC 3.4.2. OpenMPI was
>
>> > compiled with the following options (extracted from the config.status):
>
>> >
>
>> > AMD64:
>
>> >
>
>> > Open MPI config.status 1.0.2
>
>> >
>
>> > configured by ./configure, generated by GNU Autoconf 2.59,
>
>> >
>
>> > with options \"'--prefix=/usr' '--host=x86_64-pc-linux-gnu'
>
>> > '--mandir=/usr/share/man' '--infodir=/usr/share/info'
>
>> > '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib'
>
>> > '--prefix=/usr/lib64/openmpi/1.0.2-gcc-4.1'
>
>> > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'
>
>> > '--program-suffix=-1.0.2-gcc-4.1'
>
>> > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'
>
>> > '--enable-pretty-print-stacktrace'
>
>> > '--libdir=/usr/lib64/openmpi/1.0.2-gcc-4.1/lib64'
>
>> > '--build=x86_64-pc-linux-gnu' '--cache-file' 'config.cache'
>
>> > 'CFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays
>
>> > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2'
>
>> > 'CXXFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays
>
>> > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2' 'LDFLAGS=
>
>> > -Wl,-z,-noexecstack' 'build_alias=x86_64-pc-linux-gnu'
>
>> > 'host_alias=x86_64-pc-linux-gnu' --enable-ltdl-convenience\"
>
>> >
>
>> > SUN 5.8:
>
>> >
>
>> > Open MPI config.status 1.0.2
>
>> >
>
>> > configured by ./configure, generated by GNU Autoconf 2.59,
>
>> >
>
>> > with options
>
>> > \"'--prefix=/export/lca/home/lca0/etudiants/ac38820/openmpi'
>
>> > '--enable-pretty-print-stacktrace' 'CFLAGS=-mv8plus'
> 'CXXFLAGS=-mv8plus'
>
>> > --enable-ltdl-convenience\"
>
>> >
>
>> > x86 (as a working reference, configure options should be close to
>
>> > identical as the AMD64):
>
>> >
>
>> > Open MPI config.status 1.0.2
>
>> >
>
>> > configured by ./configure, generated by GNU Autoconf 2.59,
>
>> >
>
>> > with options \"'--prefix=/usr' '--host=i686-pc-linux-gnu'
>
>> > '--mandir=/usr/share/man' '--infodir=/usr/share/info'
>
>> > '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib'
>
>> > '--prefix=/usr/lib/openmpi/1.0.2-gcc-4.1'
>
>> > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'
>
>> > '--program-suffix=-1.0.2-gcc-4.1'
>
>> > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'
>
>> > '--enable-pretty-print-stacktrace' '--build=i686-pc-linux-gnu'
>
>> > '--cache-file' 'config.cache' 'CFLAGS=-march=nocona -O2 -pipe
>
>> > -fomit-frame-pointer' 'CXXFLAGS=-march=nocona -O2 -pipe
>
>> > -fomit-frame-pointer' 'LDFLAGS= -Wl,-z,-noexecstack'
>
>> > 'build_alias=i686-pc-linux-gnu' 'host_alias=i686-pc-linux-gnu'
>
>> > --enable-ltdl-convenience\"
>
>> >
>
>> > Any help would be greatly appreciated.
>
>> >
>
>> > Thanks.
>
>> >
>
>> > [1]
> http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=15775
>
>> >
>
>> > --
>
>> >
>
>> > Eric Thibodeau
>
>> >
>
>> > Neural Bucket Solutions Inc.
>
>> >
>
>> > T. (514) 736-1436
>
>> >
>
>> > C. (514) 710-0517
>
>> >
>
>> >
>
>> > ------------------------------------------------------------------------
>
>> >
>
>> > _______________________________________________
>
>> > users mailing list
>
>> > users_at_[hidden]
>
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>>
>
>>
>
> --
>
> Eric Thibodeau
>
> Neural Bucket Solutions Inc.
>
> T. (514) 736-1436
>
> C. (514) 710-0517
>
>

-- 
Thanks,
- Pak Lui
pak.lui_at_[hidden]