
Open MPI User's Mailing List Archives


Subject: [OMPI users] Problems in 1.3 loading shared libs when using VampirServer
From: Kiril Dichev (dichev_at_[hidden])
Date: 2009-02-04 12:03:18


Hi guys,

Sorry for the long e-mail.

I have been trying for some time now to run VampirServer with the shared
libraries of Open MPI 1.3.

First of all, the "--enable-static --disable-shared" build works.
The 1.2 series also worked fine with shared libraries.

But here is the story for the shared libraries with OMPI 1.3:
the compilation of OMPI went fine, and the VampirServer developers compiled
the MPI driver they need against OMPI. The driver just refers to the
shared libraries of Open MPI.

However, on launching the server I got "undefined symbol" errors of this
type:

error: /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_paffinity_linux.so:
undefined symbol: mca_base_param_reg_int

It seemed to me that my LD_LIBRARY_PATH was probably missing
<MPI_INSTALL>/lib/openmpi, but I exported it and ran "mpirun -x
LD_LIBRARY_PATH ...", and nothing changed.
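Concretely, what I tried looks like this (the prefix and driver name are placeholders, not my real paths):

```shell
# Hypothetical install prefix for the shared build; substitute your own.
MPI_PREFIX=/opt/openmpi-1.3-shared
# Open MPI's components live under lib/openmpi, so put both directories
# on the search path.
export LD_LIBRARY_PATH="$MPI_PREFIX/lib:$MPI_PREFIX/lib/openmpi${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
# Then forward the variable to every rank (hypothetical driver binary):
#   mpirun -x LD_LIBRARY_PATH -np 8 ./vampirserver-driver
```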

Then I started rebuilding every component that complained about an
"undefined symbol" with "--enable-mca-static"; for example, the message
above disappeared after I did "--enable-mca-static paffinity". I don't
know why this worked, but it seemed to help. However, each error was just
replaced by another error message from another component.
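The rebuilds looked roughly like this (the prefix and component list are illustrative; I am not claiming this is the exact required syntax):

```
./configure --prefix=/opt/openmpi-1.3-shared \
            --enable-shared \
            --enable-mca-static=paffinity \
            CC=icc CXX=icpc
make -j4 && make install
```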

After a few more components, a different error appeared:

mca: base: component_find: unable to
open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)

(full output attached)

Now I was unsure what to do, but again, compiling the complaining
component statically got things one step further. One thing that struck
me is that the directory does contain such a file, just with an extra
".so" at the end; but maybe dlopen also accepts file names without the
".so", I don't know.
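To check which variants of the file are actually present, something like this helps (the prefix is a placeholder for the real install path):

```shell
# Hypothetical prefix; substitute the real install path.
MPI_PREFIX=/opt/openmpi-1.3-shared
# The loader may probe several names (.la, .so, bare), so list every match:
ls -l "$MPI_PREFIX"/lib/openmpi/mca_rml_oob* 2>/dev/null \
    || echo "no mca_rml_oob files found"
```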

Anyway, I have now included about 20 components statically while still
building shared objects for the OMPI libraries, and things seem to work.

Does anyone have any idea why these dozens of errors occur when loading
shared libraries? As I said, I never saw this with the 1.2 series.

Thanks,
Kiril


[nv8:21349] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
[nv8:21349] [[8664,1],1] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv8:21349] [[8664,1],1] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
[nv8:21349] [[8664,1],1] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv8:21348] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
[nv8:21348] [[8664,1],0] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv8:21349] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv8:21348] [[8664,1],0] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
[nv8:21348] [[8664,1],0] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv6:25435] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
[nv6:25435] [[8664,1],4] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv8:21348] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[nv8:21351] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv6:25435] [[8664,1],4] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
[nv8:21351] [[8664,1],3] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv6:25435] [[8664,1],4] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
[nv8:21351] [[8664,1],3] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv8:21351] [[8664,1],3] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv6:25435] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv8:21351] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[nv8:21350] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
[nv8:21350] [[8664,1],2] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
[nv6:25437] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
[nv6:25437] [[8664,1],6] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
[nv6:25436] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv8:21350] [[8664,1],2] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
[nv6:25436] [[8664,1],5] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
[nv8:21350] [[8664,1],2] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv6:25437] [[8664,1],6] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv6:25436] [[8664,1],5] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
[nv6:25437] [[8664,1],6] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv8:21350] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[nv6:25436] [[8664,1],5] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv6:25437] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv6:25436] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[nv6:25438] mca: base: component_find: unable to open /home_nfs/parma/x86_64/UNITE/packages/openmpi/1.3-intel10.1-64bit-MT-shared/lib/openmpi/mca_rml_oob: file not found (ignored)
[nv6:25438] [[8664,1],7] ORTE_ERROR_LOG: Error in file ../../../../orte/mca/ess/base/ess_base_std_app.c at line 72
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[nv6:25438] [[8664,1],7] ORTE_ERROR_LOG: Error in file ../../../../../orte/mca/ess/env/ess_env_module.c at line 154
[nv6:25438] [[8664,1],7] ORTE_ERROR_LOG: Error in file ../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[nv6:25438] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!