Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Relocating an Open MPI installation using OPAL_PREFIX
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2009-01-06 16:36:12


On Tue, Jan/06/2009 10:33:31AM, Ethan Mallove wrote:
> On Mon, Jan/05/2009 10:14:30PM, Brian Barrett wrote:
> > Sorry I haven't jumped in this thread earlier -- I've been a bit behind.
> >
> > The multi-lib support worked at one time, and I can't think of why it would
> > have changed. The one condition is that libdir, includedir, etc. *MUST* be
> > specified relative to $prefix for it to work. It looks like you were
> > defining them as absolute paths, so you'd have to set libdir directly,
> > which will never work in multi-lib because mpirun and the app likely have
> > different word sizes and therefore different libdirs.
> >
>
> I see. I'll try configuring with relative paths using ${prefix} and
> the like.
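
(Concretely, I have in mind something like the following -- an untested
sketch with placeholder paths, and whether the sub-configures cope with
it is a separate question, per my comments further down:

  $ ./configure --prefix=/opt/openmpi \
                --libdir='${exec_prefix}/lib' \
                --includedir='${prefix}/include' ...

with '${exec_prefix}/lib/64' as the libdir for the 64-bit pass. The
single quotes keep the shell from expanding ${prefix} and
${exec_prefix}, so configure should record them literally and resolve
them against whatever prefix is in effect at run time.)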
>
> > More information is on the multilib page in the wiki:
> >
> > https://svn.open-mpi.org/trac/ompi/wiki/MultiLib
> >
>
> I removed this text from the MultiLib wiki page, since Open MPI *is*
> now relocatable using the OPAL_PREFIX (and related OPAL_*) environment
> variables:
>
> "Presently, Open MPI is not relocatable. That is, Open MPI *must*
> be installed and executed from which ever prefix was specified
> during configure. This is planned to change in the very near
> future."
>
> Thanks,
> Ethan
>
>
> > There is actually one condition we do not handle properly: the --prefix
> > flag to mpirun. The LD_LIBRARY_PATH will only be set for the word size of
> > mpirun, and not for the executable. Really, both would have to be added,
> > so that both orted (which is likely always 32-bit in a multi-lib
> > situation) and the app find their libraries.
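
(Right -- so for a relocated multi-lib tree, the remote environment
would presumably need both directories, something roughly like

  setenv LD_LIBRARY_PATH /opt/openmpi-relocated/lib:/opt/openmpi-relocated/lib/64

which is what the csh setup quoted further down already does.)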
> >
> > Brian
> >
> > On Jan 5, 2009, at 6:02 PM, Jeff Squyres wrote:
> >
> >> I honestly haven't thought through the ramifications of doing a multi-lib
> >> build with OPAL_PREFIX et al. :-\
> >>
> >> If you setenv OPAL_LIBDIR, it'll use whatever you set it to, so it doesn't
> >> matter what you configured --libdir with. Additionally
> >> mca/installdirs/config/install_dirs.h has this by default:
> >>
> >> #define OPAL_LIBDIR "${exec_prefix}/lib"
> >>
> >> Hence, if you use the default --libdir and setenv OPAL_PREFIX, then the
> >> libdir should pick up the right thing (because it's based on the prefix).
> >> But if you use a --libdir that is *not* based on ${exec_prefix}, then you
> >> might run into problems.
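
(So, if I follow, with the default libdir a plain

  setenv OPAL_PREFIX /tmp/foo

is enough, because the libdir then resolves to /tmp/foo/lib via
${exec_prefix}; but with a libdir that was configured as an absolute
path -- say, --libdir=/some/install/lib64 -- one would also have to set
it explicitly:

  setenv OPAL_LIBDIR /tmp/foo/lib64

That is just my reading of the above, untested.)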
> >>
> >> Perhaps you can use '--libdir="${exec_prefix}/lib64"' so that you can have
> >> your custom libdir, but still have it dependent upon the prefix that gets
> >> expanded at run time...?

Can the Open MPI configure setup handle ${exec_prefix} on the command
line? ${exec_prefix} seems to be getting eval'd to "NONE" in the
sub-configures, and I get the following error:

  ...
  *** GNU libltdl setup
  configure: OMPI configuring in opal/libltdl
  configure: running /bin/bash './configure' 'CC=cc' 'CXX=CC' 'F77=f77' 'FC=f90' '--without-threads' '--enable-heterogeneous' '--enable-cxx-exceptions' '--enable-shared' '--enable-orterun-prefix-by-default' '--with-sge' '--enable-mpi-f90' '--with-mpi-f90-size=small' '--disable-mpi-threads' '--disable-progress-threads' '--disable-debug' 'CFLAGS=-xtarget=ultra3 -m32 -xarch=sparcvis2 -xprefetch -xprefetch_level=2 -xvector=lib -xdepend=yes -xbuiltin=%all -xO5' 'CXXFLAGS=-xtarget=ultra3 -m32 -xarch=sparcvis2 -xprefetch -xprefetch_level=2 -xvector=lib -xdepend=yes -xbuiltin=%all -xO5' 'FFLAGS=-xtarget=ultra3 -m32 -xarch=sparcvis2 -xprefetch -xprefetch_level=2 -xvector=lib -stackvar -xO5' 'FCFLAGS=-xtarget=ultra3 -m32 -xarch=sparcvis2 -xprefetch -xprefetch_level=2 -xvector=lib -stackvar -xO5' '--prefix=/opt/SUNWhpc/HPC8.2/sun' '--libdir=NONE/lib' '--includedir=/opt/SUNWhpc/HPC8.2/sun/include' '--without-mx' '--with-tm=/ws/ompi-tools/orte/torque/current/shared-install32' '--with-contrib-vt-flags=--prefix=/opt/SUNWhpc/HPC8.2/sun --libdir='/lib' --includedir='/include' LDFLAGS=-R/opt/SUNWhpc/HPC8.2/sun/lib' '--with-package-string=ClusterTools 8.2' '--with-ident-string=@(#)RELEASE VERSION 1.3r20204-ct8.2-b01b-r10' --enable-ltdl-convenience --disable-ltdl-install --enable-shared --disable-static --cache-file=/dev/null --srcdir=.
  configure: WARNING: Unrecognized options: --without-threads, --enable-heterogeneous, --enable-cxx-exceptions, --enable-orterun-prefix-by-default, --with-sge, --enable-mpi-f90, --with-mpi-f90-size, --disable-mpi-threads, --disable-progress-threads, --disable-debug, --without-mx, --with-tm, --with-contrib-vt-flags, --with-package-string, --with-ident-string, --enable-ltdl-convenience
  configure: error: expected an absolute directory name for --libdir: NONE/lib
  configure: /bin/bash './configure' *failed* for opal/libltdl
  configure: error: Failed to build GNU libltdl. This usually means that something
  is incorrectly setup with your environment. There may be useful information in
  opal/libltdl/config.log. You can also disable GNU libltdl (which will disable
  dynamic shared object loading) by configuring with --disable-dlopen.

It appears some "$" characters need to be escaped before they reach the
sub-configures. (opal/libltdl is the portable dlopen() support, right?
That is, it's not an optional feature that we can temporarily turn off
to work around this issue.)
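
(Per the hint in that error output, one could append --disable-dlopen to
the configure line, e.g.:

  $ ./configure ... --disable-dlopen

but since that disables dynamic shared object loading entirely, it only
seems useful as a diagnostic to see whether the rest of the build gets
past this point, not as a real fix.)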

-Ethan

> >>
> >> (again, I'm not thinking all of this through -- just offering a few
> >> suggestions off the top of my head that you'll need to test / trace the
> >> code to be sure...)
> >>
> >>
> >> On Jan 5, 2009, at 1:35 PM, Ethan Mallove wrote:
> >>
> >>> On Thu, Dec/25/2008 08:12:49AM, Jeff Squyres wrote:
> >>>> It's quite possible that we don't handle this situation properly. Won't
> >>>> you need two libdirs (one for the 32-bit OMPI executables, and one for
> >>>> the 64-bit MPI apps)?
> >>>
> >>> I don't need an OPAL environment variable for the executables, just a
> >>> single OPAL_LIBDIR var for the libraries. (One set of 32-bit
> >>> executables runs with both 32-bit and 64-bit libraries.) I'm guessing
> >>> OPAL_LIBDIR will not work for you if you configure with a non-standard
> >>> --libdir option.
> >>>
> >>> -Ethan
> >>>
> >>>
> >>>>
> >>>> On Dec 23, 2008, at 3:58 PM, Ethan Mallove wrote:
> >>>>
> >>>>> I think the problem is that I am doing a multi-lib build. I have
> >>>>> 32-bit libraries in lib/, and 64-bit libraries in lib/64. I assume I
> >>>>> do not see the issue for 32-bit tests, because all the dependencies
> >>>>> are where Open MPI expects them to be. For the 64-bit case, I tried
> >>>>> setting OPAL_LIBDIR to /opt/openmpi-relocated/lib/lib64, but no luck.
> >>>>> Given the below configure arguments, what do my OPAL_* env vars need
> >>>>> to be? (Also, could using --enable-orterun-prefix-by-default interfere
> >>>>> with OPAL_PREFIX?)
> >>>>>
> >>>>> $ ./configure CC=cc CXX=CC F77=f77 FC=f90 --with-openib
> >>>>> --without-udapl --disable-openib-ibcm --enable-heterogeneous
> >>>>> --enable-cxx-exceptions --enable-shared
> >>>>> --enable-orterun-prefix-by-default
> >>>>> --with-sge --enable-mpi-f90 --with-mpi-f90-size=small
> >>>>> --disable-mpi-threads --disable-progress-threads --disable-debug
> >>>>> CFLAGS="-m32 -xO5" CXXFLAGS="-m32 -xO5" FFLAGS="-m32 -xO5"
> >>>>> FCFLAGS="-m32
> >>>>> -xO5"
> >>>>> --prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install
> >>>>> --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man
> >>>>> --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib
> >>>>> --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include
> >>>>> --without-mx
> >>>>> --with-tm=/ws/ompi-tools/orte/torque/current/shared-install32
> >>>>> --with-contrib-vt-flags="--prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install
> >>>>> --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man
> >>>>> --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib
> >>>>> --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include
> >>>>> LDFLAGS=-R/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib"
> >>>>>
> >>>>> $ ./configure CC=cc CXX=CC F77=f77 FC=f90 --with-openib
> >>>>> --without-udapl --disable-openib-ibcm --enable-heterogeneous
> >>>>> --enable-cxx-exceptions --enable-shared
> >>>>> --enable-orterun-prefix-by-default
> >>>>> --with-sge --enable-mpi-f90 --with-mpi-f90-size=small
> >>>>> --disable-mpi-threads --disable-progress-threads --disable-debug
> >>>>> CFLAGS="-m64 -xO5" CXXFLAGS="-m64 -xO5" FFLAGS="-m64 -xO5"
> >>>>> FCFLAGS="-m64
> >>>>> -xO5"
> >>>>> --prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install
> >>>>> --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man
> >>>>> --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib/lib64
> >>>>> --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include/64
> >>>>> --without-mx
> >>>>> --with-tm=/ws/ompi-tools/orte/torque/current/shared-install64
> >>>>> --with-contrib-vt-flags="--prefix=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install
> >>>>> --mandir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/man
> >>>>> --libdir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib/lib64
> >>>>> --includedir=/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/include/64
> >>>>> LDFLAGS=-R/workspace/em162155/hpc/mtt-scratch/burl-ct-v20z-12/ompi-tarball-testing/installs/DGQx/install/lib"
> >>>>> --disable-binaries
> >>>>>
> >>>>> -Ethan
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Dec 22, 2008, at 12:42 PM, Ethan Mallove wrote:
> >>>>>>
> >>>>>>> Can anyone get OPAL_PREFIX to work on Linux? A simple test is to see
> >>>>>>> if the following works for any mpicc/mpirun:
> >>>>>>>
> >>>>>>> $ mv <openmpi-installation> /tmp/foo
> >>>>>>> $ setenv OPAL_PREFIX /tmp/foo
> >>>>>>> $ mpicc ...
> >>>>>>> $ mpirun ...
> >>>>>>>
> >>>>>>> If you are able to get the above to run successfully, I'm interested
> >>>>>>> in your config.log file.
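
(For Bourne-type shells the equivalent would be
"export OPAL_PREFIX=/tmp/foo" before running mpicc/mpirun.)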
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Ethan
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, Dec/18/2008 11:03:25AM, Ethan Mallove wrote:
> >>>>>>>> Hello,
> >>>>>>>>
> >>>>>>>> The below FAQ lists instructions on how to use a relocated Open MPI
> >>>>>>>> installation:
> >>>>>>>>
> >>>>>>>> http://www.open-mpi.org/faq/?category=building#installdirs
> >>>>>>>>
> >>>>>>>> On Solaris, OPAL_PREFIX and friends (documented in the FAQ) work for
> >>>>>>>> me with both MPI (hello_c) and non-MPI (hostname) programs. On
> >>>>>>>> Linux,
> >>>>>>>> I can only get the non-MPI case to work. Here are the environment
> >>>>>>>> variables I am setting:
> >>>>>>>>
> >>>>>>>> $ cat setenv_opal_prefix.csh
> >>>>>>>> set opal_prefix = "/opt/openmpi-relocated"
> >>>>>>>>
> >>>>>>>> setenv OPAL_PREFIX $opal_prefix
> >>>>>>>> setenv OPAL_BINDIR $opal_prefix/bin
> >>>>>>>> setenv OPAL_SBINDIR $opal_prefix/sbin
> >>>>>>>> setenv OPAL_DATAROOTDIR $opal_prefix/share
> >>>>>>>> setenv OPAL_SYSCONFDIR $opal_prefix/etc
> >>>>>>>> setenv OPAL_SHAREDSTATEDIR $opal_prefix/com
> >>>>>>>> setenv OPAL_LOCALSTATEDIR $opal_prefix/var
> >>>>>>>> setenv OPAL_LIBDIR $opal_prefix/lib
> >>>>>>>> setenv OPAL_INCLUDEDIR $opal_prefix/include
> >>>>>>>> setenv OPAL_INFODIR $opal_prefix/info
> >>>>>>>> setenv OPAL_MANDIR $opal_prefix/man
> >>>>>>>>
> >>>>>>>> setenv PATH $opal_prefix/bin:$PATH
> >>>>>>>> setenv LD_LIBRARY_PATH $opal_prefix/lib:$opal_prefix/lib/64
> >>>>>>>>
> >>>>>>>> Here is the error I get:
> >>>>>>>>
> >>>>>>>> $ mpirun -np 2 hello_c
> >>>>>>>>
> >>>>>>>> --------------------------------------------------------------------------
> >>>>>>>> It looks like opal_init failed for some reason; your parallel process is
> >>>>>>>> likely to abort. There are many reasons that a parallel process can
> >>>>>>>> fail during opal_init; some of which are due to configuration or
> >>>>>>>> environment problems. This failure appears to be an internal failure;
> >>>>>>>> here's some additional information (which may only be relevant to an
> >>>>>>>> Open MPI developer):
> >>>>>>>>
> >>>>>>>> opal_carto_base_select failed
> >>>>>>>> --> Returned value -13 instead of OPAL_SUCCESS
> >>>>>>>>
> >>>>>>>> --------------------------------------------------------------------------
> >>>>>>>> [burl-ct-v20z-0:27737] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in
> >>>>>>>> file runtime/orte_init.c at line 77
> >>>>>>>>
> >>>>>>>> Any ideas on what's going on?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Ethan
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Jeff Squyres
> >>>>>> Cisco Systems
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Jeff Squyres
> >>>> Cisco Systems
> >>>>
> >>
> >>
> >> --
> >> Jeff Squyres
> >> Cisco Systems
> >>
> >>
> >