Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Openmpi 1.3 problems with libtool-ltdl on CentOS 4 and 5
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2009-01-23 09:31:59


Ew. Yes, I can see this being a problem.

I'm guessing that the real issue is that OMPI embeds the libltdl from
LT 2.2.6a inside libopen_pal (one of the internal OMPI libraries).
Waving my hands a bit, but it's not hard to imagine some sort of clash
is going on between the -lltdl you added to the command line and the
libltdl that is embedded in OMPI's libraries.

Can you verify that this is what is happening?
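
One way to check this from inside the failing program is to compare where a libltdl symbol resolves process-wide with where the system library defines it. The following is only a sketch: the helper name symbol_resolves_to is made up for illustration, and it uses nothing beyond the POSIX dlopen/dlsym/dlclose calls.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Open MPI or libltdl).
 * Returns 1 if the process-wide lookup of `symbol` lands on the definition
 * found through `library`, 0 if it lands on a different definition, and -1
 * if the library or the symbol cannot be found.
 */
int symbol_resolves_to(const char *library, const char *symbol)
{
    /* Handle for the global scope: the executable plus the libraries
       that were loaded together with it. */
    void *global = dlopen(NULL, RTLD_LAZY);
    /* Separate handle whose lookups start in `library` itself. */
    void *lib = dlopen(library, RTLD_LAZY | RTLD_LOCAL);
    int result = -1;

    if (global != NULL && lib != NULL) {
        void *seen_globally = dlsym(global, symbol);
        void *seen_in_lib = dlsym(lib, symbol);
        if (seen_globally != NULL && seen_in_lib != NULL)
            result = (seen_globally == seen_in_lib) ? 1 : 0;
    }
    if (lib != NULL)
        dlclose(lib);
    if (global != NULL)
        dlclose(global);
    return result;
}
```

Called from the reproducer before MPI_Init, for example as symbol_resolves_to("libltdl.so.3", "lt_dlopenext"), a result of 0 would suggest that the calls are bound to a copy other than the system libltdl. The soname libltdl.so.3 is an assumption about the libtool 1.5 package and should be checked on the machine in question; on older glibc the program also needs -ldl.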

If it is, I wonder if we should petition the LT authors to give us a
configure option to prefix all the symbols in libltdl so that we don't
get clashes like this -- similar to what we do with libevent and PLPA
(both of which are also embedded in Open MPI's internal libraries).
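
To illustrate what such prefixing could look like, here is a minimal sketch of the rename-macro technique. All names in it (DEMO_PREFIX, lt_demo_open, exported_name) are hypothetical; it is not taken from libltdl, libevent, or PLPA, and it does not claim to be how those packages implement their prefixes.

```c
#include <stdio.h>

/* Hypothetical rename header: each public name of the embedded copy is
   mapped to a prefixed name before the library sources are compiled. */
#define DEMO_PREFIX(sym) demo_embedded_ ## sym
#define lt_demo_open DEMO_PREFIX(lt_demo_open)

/* Two-level stringification so the renamed identifier can be printed. */
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

/* The source still says lt_demo_open, but the compiler emits the symbol
   demo_embedded_lt_demo_open, so it cannot collide with another copy
   that exports the unprefixed name. */
int lt_demo_open(const char *name)
{
    return name != NULL;
}

/* Reports the symbol name that callers actually link against. */
const char *exported_name(void)
{
    return DEMO_STR(lt_demo_open);
}
```

Code that includes the rename header keeps calling lt_demo_open unchanged, while an application that links its own unprefixed copy of the library sees no overlapping symbols.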

On Jan 23, 2009, at 9:04 AM, Roy Dragseth wrote:

> Hi, all.
>
> I do not know whether this should be considered a real bug; I'm just
> reporting it here so people can find it if they google the error
> message it produces. There is a backtrace at the end of this mail.
>
> Problem description:
>
> Open MPI 1.3 seems to be nonfunctional when used with the libltdl from
> libtool v1.5 that is installed on CentOS (aka RHEL) 4 and 5. Upgrading
> to libtool v2.2.6a (and maybe earlier versions) solves the problem. We
> saw this problem with both gcc and icc.
>
> Here is a code snippet that is extracted from the real application.
>
> nestcrash.c:
> #include <mpi.h>
> #include <ltdl.h>
>
> int main(int argc, char *argv[])
> {
>     MPI_Init(&argc, &argv);
>
>     char *dummy = "dummy";
>     const lt_dlhandle hModule = lt_dlopenext(dummy);
> }
>
> This will crash in MPI_Init when using libtool 1.5.X; if you comment
> out the lt_dlopenext call, it runs normally.
>
> I can provide a complete example if necessary.
>
> As I said earlier, upgrading to libtool 2.2.6a solved the problem for us.
>
> Here is the backtrace:
>
> *** Process received signal ***
> Signal: Segmentation fault (11)
> Signal code: (128)
> Failing at address: (nil)
> [ 0] /lib64/tls/libpthread.so.0 [0x3ffce0c4f0]
> [ 1] /global/apps/openmpi/1.3rc2/lib/libopen-pal.so.0 [0x2a95d4bce5]
> [ 2] /global/apps/openmpi/1.3rc2/lib/libopen-pal.so.0(lt_dlopenadvise+0xf0) [0x2a95d4b470]
> [ 3] /global/apps/openmpi/1.3rc2/lib/libopen-pal.so.0 [0x2a95d56e1f]
> [ 4] /global/apps/openmpi/1.3rc2/lib/libopen-pal.so.0(mca_base_component_find+0x58d) [0x2a95d5657d]
> [ 5] /global/apps/openmpi/1.3rc2/lib/libopen-pal.so.0(mca_base_components_open+0x1ae) [0x2a95d581be]
> [ 6] /global/apps/openmpi/1.3rc2/lib/libopen-pal.so.0(opal_paffinity_base_open+0xad) [0x2a95d73ddd]
> [ 7] /global/apps/openmpi/1.3rc2/lib/libopen-pal.so.0(opal_init+0x64) [0x2a95d43e64]
> [ 8] /global/apps/openmpi/1.3rc2/lib/libopen-rte.so.0(orte_init+0x1e) [0x2a95bdeb8e]
> [ 9] /global/apps/openmpi/1.3rc2/lib/libmpi.so.0 [0x2a95a38fee]
> [10] /global/apps/openmpi/1.3rc2/lib/libmpi.so.0(PMPI_Init_thread+0x72) [0x2a95a5b9c2]
> [11] nest-ompi_1.3rc2/bin/nest(_ZN4nest12Communicator4initEPiPPPc+0x11f) [0x55440f]
> [12] nest-ompi_1.3rc2/bin/nest(main+0x74) [0x4a7674]
> [13] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x339271c3fb]
> [14] nest-ompi_1.3rc2/bin/nest(_ZNSt8ios_base4InitD1Ev+0x5a) [0x4a756a]
> *** End of error message ***
>
>
>
> --
>
> The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
> phone:+47 77 64 41 07, fax:+47 77 64 41 00
> Roy Dragseth, Team Leader, High Performance Computing
> Direct call: +47 77 64 62 56. email: roy.dragseth_at_[hidden]

-- 
Jeff Squyres
Cisco Systems