Hello,

I don't want to stray too far off topic in this reply, but you're bringing up a good point here. I am unable to run MPI apps on the AMD64 platform with the usual exporting of $LD_LIBRARY_PATH and $PATH, which is why I have no choice but to fall back on the --prefix approach. Here are a few execution examples to demonstrate my point:

kyron@headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun --prefix /usr/lib64/openmpi/1.0.2-gcc-4.1/ -np 2 ./a.out

./a.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

kyron@headless ~ $ /usr/lib64/openmpi/1.0.2-gcc-4.1/bin/mpirun --prefix /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 2 ./a.out

[headless:10827] pls:rsh: execv failed with errno=2

[headless:10827] ERROR: A daemon on node localhost failed to start as expected.

[headless:10827] ERROR: There may be more information available from

[headless:10827] ERROR: the remote shell (see above).

[headless:10827] ERROR: The daemon exited unexpectedly with status 255.

kyron@headless ~ $ cat opmpi64.sh

#!/bin/bash

MPI_BASE='/usr/lib64/openmpi/1.0.2-gcc-4.1'

export PATH=$PATH:${MPI_BASE}/bin

LD_LIBRARY_PATH=${MPI_BASE}/lib64

kyron@headless ~ $ . opmpi64.sh

kyron@headless ~ $ mpirun -np 2 ./a.out

./a.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

kyron@headless ~ $
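
One thing worth checking in opmpi64.sh above: LD_LIBRARY_PATH is assigned but never exported. If the variable was not already in the environment, bash keeps it local to the shell, so neither mpirun nor a.out would see it, which would be consistent with the libmpi.so.0 error. Below is a minimal sketch of the script with the export added. It assumes the same install path as above; the MPI_BASE override and the prepend order are my own choices, not something Open MPI requires.

```shell
#!/bin/bash
# Sketch of opmpi64.sh with LD_LIBRARY_PATH exported.
# MPI_BASE defaults to the install path used in the session above
# (assumption: the libraries live in ${MPI_BASE}/lib64, per --libdir).
MPI_BASE="${MPI_BASE:-/usr/lib64/openmpi/1.0.2-gcc-4.1}"

# Prepend so this Open MPI install wins over any other one on PATH.
export PATH="${MPI_BASE}/bin:${PATH}"

# 'export' makes the variable visible to child processes (mpirun, a.out).
# The ${VAR:+...} form avoids a trailing ':' when the variable was empty.
export LD_LIBRARY_PATH="${MPI_BASE}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

After sourcing it with `. opmpi64.sh`, running `ldd ./a.out | grep libmpi` should show whether the loader now resolves libmpi.so.0.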

Eric

On Friday, June 16, 2006, at 10:31, Pak Lui wrote:

> Hi, I noticed your prefix set to the lib dir, can you try without the

> lib64 part and rerun?

>

> Eric Thibodeau wrote:

> > Hello everyone,

> >

> > Well, first off, I hope the problem I am reporting is a valid one.

> > I tried finding similar situations on Google and the mailing list but

> > came up with only one reference [1], which does not seem to apply in my

> > case since all executions are local (a naïve assumption on my part that

> > this makes a difference to the call stack). I am trying to run a simple

> > HelloWorld using OpenMPI 1.0.2 on an AMD64 machine and a Sun Enterprise

> > (12 procs) machine. In both cases I get the following error:

> >

> > pls:rsh: execv failed with errno=2

> >

> > Here is the mpirun -d trace when running my HelloWorld (on AMD64):

> >

> > kyron@headless ~ $ mpirun -d --prefix

> > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/ -np 4 ./hello

> >

> > [headless:10461] procdir: (null)

> >

> > [headless:10461] jobdir: (null)

> >

> > [headless:10461] unidir:

> > /tmp/openmpi-sessions-kyron@headless_0/default-universe

> >

> > [headless:10461] top: openmpi-sessions-kyron@headless_0

> >

> > [headless:10461] tmp: /tmp

> >

> > [headless:10461] [0,0,0] setting up session dir with

> >

> > [headless:10461] tmpdir /tmp

> >

> > [headless:10461] universe default-universe-10461

> >

> > [headless:10461] user kyron

> >

> > [headless:10461] host headless

> >

> > [headless:10461] jobid 0

> >

> > [headless:10461] procid 0

> >

> > [headless:10461] procdir:

> > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/0/0

> >

> > [headless:10461] jobdir:

> > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/0

> >

> > [headless:10461] unidir:

> > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461

> >

> > [headless:10461] top: openmpi-sessions-kyron@headless_0

> >

> > [headless:10461] tmp: /tmp

> >

> > [headless:10461] [0,0,0] contact_file

> > /tmp/openmpi-sessions-kyron@headless_0/default-universe-10461/universe-setup.txt

> >

> > [headless:10461] [0,0,0] wrote setup file

> >

> > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x1)

> >

> > [headless:10461] pls:rsh: local csh: 0, local bash: 1

> >

> > [headless:10461] pls:rsh: assuming same remote shell as local shell

> >

> > [headless:10461] pls:rsh: remote csh: 0, remote bash: 1

> >

> > [headless:10461] pls:rsh: final template argv:

> >

> > [headless:10461] pls:rsh: /usr/bin/ssh <template> orted --debug

> > --bootproxy 1 --name <template> --num_procs 2 --vpid_start 0 --nodename

> > <template> --universe kyron@headless:default-universe-10461 --nsreplica

> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657" --gprreplica

> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"

> > --mpi-call-yield 0

> >

> > [headless:10461] pls:rsh: launching on node localhost

> >

> > [headless:10461] pls:rsh: oversubscribed -- setting mpi_yield_when_idle

> > to 1 (1 4)

> >

> > [headless:10461] pls:rsh: localhost is a LOCAL node

> >

> > [headless:10461] pls:rsh: reset PATH:

> > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/bin:/usr/local/bin:/usr/bin:/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.1.1:/opt/c3-4:/usr/qt/3/bin:/usr/lib64/openmpi/1.0.2-gcc-4.1/bin

> >

> > [headless:10461] pls:rsh: reset LD_LIBRARY_PATH:

> > /usr/lib64/openmpi/1.0.2-gcc-4.1/lib64/lib

> >

> > [headless:10461] pls:rsh: changing to directory /home/kyron

> >

> > [headless:10461] pls:rsh: executing: orted --debug --bootproxy 1 --name

> > 0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost --universe

> > kyron@headless:default-universe-10461 --nsreplica

> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657" --gprreplica

> > "0.0.0;tcp://142.137.135.124:37657;tcp://192.168.1.1:37657"

> > --mpi-call-yield 1

> >

> > [headless:10461] pls:rsh: execv failed with errno=2

> >

> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving

> >

> > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0xa)

> >

> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving

> >

> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving

> >

> > [headless:10461] sess_dir_finalize: proc session dir not empty - leaving

> >

> > [headless:10461] spawn: in job_state_callback(jobid = 1, state = 0x9)

> >

> > [headless:10461] ERROR: A daemon on node localhost failed to start as

> > expected.

> >

> > [headless:10461] ERROR: There may be more information available from

> >

> > [headless:10461] ERROR: the remote shell (see above).

> >

> > [headless:10461] ERROR: The daemon exited unexpectedly with status 255.

> >

> > [headless:10461] sess_dir_finalize: found proc session dir empty - deleting

> >

> > [headless:10461] sess_dir_finalize: found job session dir empty - deleting

> >

> > [headless:10461] sess_dir_finalize: found univ session dir empty - deleting

> >

> > [headless:10461] sess_dir_finalize: top session dir not empty - leaving

> >

> > The two platforms are very different, one is AMD64 (dual Opteron) with

> > GCC 4.1.1 (Gentoo), the other is SUN OS 5.8 with GCC 3.4.2. OpenMPI was

> > compiled with the following options (extracted from the config.status):

> >

> > AMD64:

> >

> > Open MPI config.status 1.0.2

> >

> > configured by ./configure, generated by GNU Autoconf 2.59,

> >

> > with options \"'--prefix=/usr' '--host=x86_64-pc-linux-gnu'

> > '--mandir=/usr/share/man' '--infodir=/usr/share/info'

> > '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib'

> > '--prefix=/usr/lib64/openmpi/1.0.2-gcc-4.1'

> > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'

> > '--program-suffix=-1.0.2-gcc-4.1'

> > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'

> > '--enable-pretty-print-stacktrace'

> > '--libdir=/usr/lib64/openmpi/1.0.2-gcc-4.1/lib64'

> > '--build=x86_64-pc-linux-gnu' '--cache-file' 'config.cache'

> > 'CFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays

> > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2'

> > 'CXXFLAGS=-march=opteron -O2 -pipe -ftracer -fprefetch-loop-arrays

> > -mfpmath=sse -ffast-math -ftree-vectorize -floop-optimize2' 'LDFLAGS=

> > -Wl,-z,-noexecstack' 'build_alias=x86_64-pc-linux-gnu'

> > 'host_alias=x86_64-pc-linux-gnu' --enable-ltdl-convenience\"

> >

> > SUN 5.8:

> >

> > Open MPI config.status 1.0.2

> >

> > configured by ./configure, generated by GNU Autoconf 2.59,

> >

> > with options

> > \"'--prefix=/export/lca/home/lca0/etudiants/ac38820/openmpi'

> > '--enable-pretty-print-stacktrace' 'CFLAGS=-mv8plus' 'CXXFLAGS=-mv8plus'

> > --enable-ltdl-convenience\"

> >

> > x86 (as a working reference; the configure options should be nearly

> > identical to those for AMD64):

> >

> > Open MPI config.status 1.0.2

> >

> > configured by ./configure, generated by GNU Autoconf 2.59,

> >

> > with options \"'--prefix=/usr' '--host=i686-pc-linux-gnu'

> > '--mandir=/usr/share/man' '--infodir=/usr/share/info'

> > '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib'

> > '--prefix=/usr/lib/openmpi/1.0.2-gcc-4.1'

> > '--datadir=/usr/share/openmpi/1.0.2-gcc-4.1'

> > '--program-suffix=-1.0.2-gcc-4.1'

> > '--sysconfdir=/etc/openmpi/1.0.2-gcc-4.1'

> > '--enable-pretty-print-stacktrace' '--build=i686-pc-linux-gnu'

> > '--cache-file' 'config.cache' 'CFLAGS=-march=nocona -O2 -pipe

> > -fomit-frame-pointer' 'CXXFLAGS=-march=nocona -O2 -pipe

> > -fomit-frame-pointer' 'LDFLAGS= -Wl,-z,-noexecstack'

> > 'build_alias=i686-pc-linux-gnu' 'host_alias=i686-pc-linux-gnu'

> > --enable-ltdl-convenience\"

> >

> > Any help would be greatly appreciated.

> >

> > Thanks.

> >

> > [1] http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=15775

> >

> > --

> >

> > Eric Thibodeau

> >

> > Neural Bucket Solutions Inc.

> >

> > T. (514) 736-1436

> >

> > C. (514) 710-0517

> >

> >


>

>

--

Eric Thibodeau

Neural Bucket Solutions Inc.

T. (514) 736-1436

C. (514) 710-0517