
Subject: Re: [OMPI users] Relocating an Open MPI installation using OPAL_PREFIX
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2008-12-23 15:58:46


On Tue, Dec/23/2008 02:33:07PM, Jeff Squyres wrote:
> Yes, it works for me... :-\
>
> With initial install dir of /home/jsquyres/bogus (in my $path and
> $LD_LIBRARY_PATH already):
>
> [11:30] svbu-mpi:~/mpi % mpicc hello.c -o hello
> [11:30] svbu-mpi:~/mpi % mpirun -np 2 hello
> stdout: Hello, world! I am 0 of 2 (svbu-mpi.cisco.com)
> stdout: Hello, world! I am 1 of 2 (svbu-mpi.cisco.com)
> stderr: Hello, world! I am 0 of 2 (svbu-mpi.cisco.com)
> stderr: Hello, world! I am 1 of 2 (svbu-mpi.cisco.com)
>
> Now let's move it:
>
> [11:30] svbu-mpi:~/mpi % cd
> [11:31] svbu-mpi:~ % cd /home/jsquyres/
> [11:31] svbu-mpi:/home/jsquyres % mv bogus bogus-bogus
> [11:31] svbu-mpi:/home/jsquyres % set path = (/home/jsquyres/bogus-bogus/bin $path)
> [11:31] svbu-mpi:/home/jsquyres % setenv LD_LIBRARY_PATH /home/jsquyres/bogus-bogus/lib:$LD_LIBRARY_PATH
> [11:31] svbu-mpi:/home/jsquyres % cd
>
> Confirm that it's broken:
>
> [11:31] svbu-mpi:~ % cd mpi
> [11:31] svbu-mpi:~/mpi % !mpir
> mpirun -np 2 hello
> --------------------------------------------------------------------------
> Sorry! You were supposed to get help about:
> opal_init:startup:internal-failure
> from the file:
> help-opal-runtime.txt
> But I couldn't find any file matching that name. Sorry!
> --------------------------------------------------------------------------
> [svbu-mpi.cisco.com:23042] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 77
> [svbu-mpi.cisco.com:23042] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 493
>
> Now try with OPAL_PREFIX:
>
> [11:31] svbu-mpi:~/mpi % setenv OPAL_PREFIX /home/jsquyres/bogus-bogus
> [11:31] svbu-mpi:~/mpi % mpirun -np 2 hello
> stdout: Hello, world! I am 0 of 2 (svbu-mpi.cisco.com)
> stderr: Hello, world! I am 0 of 2 (svbu-mpi.cisco.com)
> stdout: Hello, world! I am 1 of 2 (svbu-mpi.cisco.com)
> stderr: Hello, world! I am 1 of 2 (svbu-mpi.cisco.com)
> [11:31] svbu-mpi:~/mpi %
>
> I don't know what you'd like from config.log -- I configured it with a
> simple:
>
> $ ./configure --prefix=/home/jsquyres/bogus
>

I think the problem is that I am doing a multi-lib build: I have
32-bit libraries in lib/ and 64-bit libraries in lib/lib64 (per the
--libdir in the 64-bit configure line below). I assume I do not see
the issue for the 32-bit tests because all the dependencies are where
Open MPI expects them to be. For the 64-bit case, I tried setting
OPAL_LIBDIR to /opt/openmpi-relocated/lib/lib64, but no luck. Given
the configure arguments below, what do my OPAL_* env vars need to be?
(Also, could using --enable-orterun-prefix-by-default interfere with
OPAL_PREFIX?)
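
In case it helps, here is roughly what I am setting for the 64-bit
attempt (just a sketch -- /opt/openmpi-relocated stands in for the
relocated prefix, and the lib/lib64 layout matches the 64-bit
--libdir below):

    # csh sketch of the 64-bit relocation attempt
    setenv OPAL_PREFIX /opt/openmpi-relocated
    # the 64-bit build was configured with --libdir=<prefix>/lib/lib64,
    # so OPAL_LIBDIR points at the matching relocated directory
    setenv OPAL_LIBDIR ${OPAL_PREFIX}/lib/lib64
    setenv PATH ${OPAL_PREFIX}/bin:${PATH}
    setenv LD_LIBRARY_PATH ${OPAL_PREFIX}/lib:${OPAL_PREFIX}/lib/lib64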

    $ ./configure CC=cc CXX=CC F77=f77 FC=f90 --with-openib --without-udapl --disable-openib-ibcm --enable-heterogeneous --enable-cxx-exceptions --enable-shared --enable-orterun-prefix-by-default --with-sge --enable-mpi-f90 --with-mpi-f90-size=small --disable-mpi-threads --disable-progress-threads --disable-debug CFLAGS="-m32 -xO5" CXXFLAGS="-m32 -xO5" FFLAGS="-m32 -xO5" FCFLAGS="-m32 -xO5" --prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include --without-mx --with-tm=/ws/ompi-tools/orte/torque/current/shared-install32 --with-contrib-vt-flags="--prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include LDFLAGS=-R/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib"

    $ ./configure CC=cc CXX=CC F77=f77 FC=f90 --with-openib --without-udapl --disable-openib-ibcm --enable-heterogeneous --enable-cxx-exceptions --enable-shared --enable-orterun-prefix-by-default --with-sge --enable-mpi-f90 --with-mpi-f90-size=small --disable-mpi-threads --disable-progress-threads --disable-debug CFLAGS="-m64 -xO5" CXXFLAGS="-m64 -xO5" FFLAGS="-m64 -xO5" FCFLAGS="-m64 -xO5" --prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib/lib64 --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include/64 --without-mx --with-tm=/ws/ompi-tools/orte/torque/current/shared-install64 --with-contrib-vt-flags="--prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib/lib64 --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include/64 LDFLAGS=-R/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib" --disable-binaries
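
To sanity-check which directories a relocated install actually
resolves, something like this might help (I am assuming here that
ompi_info's --path output reflects the OPAL_* environment overrides;
I have not verified that):

    $ setenv OPAL_PREFIX /opt/openmpi-relocated
    $ ompi_info --path prefix
    $ ompi_info --path libdir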

-Ethan

>
>
> On Dec 22, 2008, at 12:42 PM, Ethan Mallove wrote:
>
>> Can anyone get OPAL_PREFIX to work on Linux? A simple test is to see
>> if the following works for any mpicc/mpirun:
>>
>> $ mv <openmpi-installation> /tmp/foo
>> $ setenv OPAL_PREFIX /tmp/foo
>> $ mpicc ...
>> $ mpirun ...
>>
>> If you are able to get the above to run successfully, I'm interested
>> in your config.log file.
>>
>> Thanks,
>> Ethan
>>
>>
>> On Thu, Dec/18/2008 11:03:25AM, Ethan Mallove wrote:
>>> Hello,
>>>
>>> The below FAQ lists instructions on how to use a relocated Open MPI
>>> installation:
>>>
>>> http://www.open-mpi.org/faq/?category=building#installdirs
>>>
>>> On Solaris, OPAL_PREFIX and friends (documented in the FAQ) work for
>>> me with both MPI (hello_c) and non-MPI (hostname) programs. On Linux,
>>> I can only get the non-MPI case to work. Here are the environment
>>> variables I am setting:
>>>
>>> $ cat setenv_opal_prefix.csh
>>> set opal_prefix = "/opt/openmpi-relocated"
>>>
>>> setenv OPAL_PREFIX $opal_prefix
>>> setenv OPAL_BINDIR $opal_prefix/bin
>>> setenv OPAL_SBINDIR $opal_prefix/sbin
>>> setenv OPAL_DATAROOTDIR $opal_prefix/share
>>> setenv OPAL_SYSCONFDIR $opal_prefix/etc
>>> setenv OPAL_SHAREDSTATEDIR $opal_prefix/com
>>> setenv OPAL_LOCALSTATEDIR $opal_prefix/var
>>> setenv OPAL_LIBDIR $opal_prefix/lib
>>> setenv OPAL_INCLUDEDIR $opal_prefix/include
>>> setenv OPAL_INFODIR $opal_prefix/info
>>> setenv OPAL_MANDIR $opal_prefix/man
>>>
>>> setenv PATH $opal_prefix/bin:$PATH
>>> setenv LD_LIBRARY_PATH $opal_prefix/lib:$opal_prefix/lib/64
>>>
>>> Here is the error I get:
>>>
>>> $ mpirun -np 2 hello_c
>>>
>>> --------------------------------------------------------------------------
>>> It looks like opal_init failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during opal_init; some of which are due to configuration or
>>> environment problems. This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>>
>>> opal_carto_base_select failed
>>> --> Returned value -13 instead of OPAL_SUCCESS
>>>
>>> --------------------------------------------------------------------------
>>> [burl-ct-v20z-0:27737] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 77
>>>
>>> Any ideas on what's going on?
>>>
>>> Thanks,
>>> Ethan
>
>
> --
> Jeff Squyres
> Cisco Systems
>