Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] libtool issue with crs/self
From: Brian W. Barrett (brbarret_at_[hidden])
Date: 2009-08-05 11:35:50


Josh -

Just in case it wasn't clear -- if you're only looking for a symbol in the
executable (which you know is there), you do *NOT* have to dlopen() the
executable first (you do with libtool, to support the "I don't have dynamic
library support" mode of operation). You only have to dlsym() with
RTLD_DEFAULT, as the symbol is already in the process space.
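
For illustration, a minimal sketch of that lookup in C. The names
lookup_in_process and my_checkpoint_callback are hypothetical (they are not
taken from the self CRS source), and the visibility attribute is the one
Josh mentions below for builds that use -fvisibility=hidden.

```c
#define _GNU_SOURCE   /* glibc only exposes RTLD_DEFAULT when this is defined */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical application callback, for illustration only. The attribute
 * keeps the symbol exported even when compiling with -fvisibility=hidden. */
__attribute__((visibility("default")))
int my_checkpoint_callback(void)
{
    return 0;
}

/* Search the global scope of the running process for `name`.
 * No dlopen() of the executable is required first.
 * Returns NULL if the symbol is not found. */
void *lookup_in_process(const char *name)
{
    dlerror();                        /* clear any stale error state */
    return dlsym(RTLD_DEFAULT, name);
}
```

With the GNU toolchain, a symbol defined in the executable itself is
typically only visible to dlsym() if the executable is linked with -rdynamic
(or -Wl,--export-dynamic); older glibc versions also need -ldl.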

It does probably mean we can't support self on platforms without dlsym(),
but that set is extremely small and since we don't use libtool to link the
final executable, the lt_dlsym wrappers wouldn't have worked anyway.

Brian

On Wed, 5 Aug 2009, George Bosilca wrote:

> Josh,
>
> These look like two different issues to me. One is how some modules from Open
> MPI can use the libltdl, and for this you highlighted the issue. The second
> is that the users who want to use the self CRS have to make sure the symbols
> required by self CRS are visible in their application. This is clearly an
> item for the FAQ.
>
> george.
>
> On Aug 5, 2009, at 10:51 , Josh Hursey wrote:
>
>> As an update on this thread: I had a bit of time this morning to look into
>> this.
>>
>> I noticed that the "-fvisibility=hidden" option when passed to libltdl will
>> cause it to fail in its configure test for:
>> "checking whether a program can dlopen itself"
>> This is because the symbol they are trying to look for with dlsym() is not
>> postfixed with:
>> __attribute__ ((visibility("default")))
>> If I do that, then the test passes correctly.
>>
>> I am not sure if this is a configure bug in Libtool or not. But what it
>> means is that even with the wrapper around the OPAL libltdl routines, it is
>> not useful to me since I need to open the executable to examine it for the
>> necessary symbols.
>>
>> So I might try to go down the track of using dlopen/dlsym/dlclose directly
>> instead of through the libtool interfaces. However I just wanted to mention
>> that this is happening in case there are other places in the codebase that
>> ever want to look into the executable for symbols, and find that
>> lt_dlopen() fails in non-obvious ways.
>>
>> -- Josh
>>
>> On Jul 29, 2009, at 11:01 AM, Brian W. Barrett wrote:
>>
>>> Never mind, I'm an idiot. I still don't like the wrappers around
>>> lt_dlopen in util, but it might be your best option. Are you looking for
>>> symbols in components or the executable? I assumed the executable, in
>>> which case you might be better off just using dlsym() directly. If you
>>> want the symbol from the first place it's found, then you can just do:
>>>
>>> dlsym(RTLD_DEFAULT, symbol);
>>>
>>> The lt_dlsym only really helps if you're running on really obscure
>>> platforms which don't support dlsym and loading "preloaded" components.
>>>
>>> Brian
>>>
>>> On Wed, 29 Jul 2009, Brian W. Barrett wrote:
>>>
>>>> What are you trying to do with lt_dlopen? It seems like you should
>>>> always go through the MCA base utilities. If one's missing, adding it
>>>> there seems like the right mechanism.
>>>>
>>>> Brian
>>>>
>>>> On Wed, 29 Jul 2009, Josh Hursey wrote:
>>>>
>>>>> George suggested that to me as well yesterday after the meeting. So we
>>>>> would create opal interfaces to libtool (similar to what we do with the
>>>>> event engine). That might be the best way to approach this.
>>>>> I'll start to take a look at implementing this. Since opal/libltdl is
>>>>> not part of the repository, is there a 'right' place to put this header?
>>>>> maybe in opal/util/?
>>>>> Thanks,
>>>>> Josh
>>>>> On Jul 28, 2009, at 6:57 PM, Jeff Squyres (jsquyres) wrote:
>>>>>> Josh - this is almost certainly what happened. Yoinks. Unfortunately,
>>>>>> there are good reasons for it. :(
>>>>>> What about if we proxy calls to lt_dlopen through an opal function
>>>>>> call?
>>>>>> -jms
>>>>>> Sent from my PDA. No type good.
>>>>>> ----- Original Message -----
>>>>>> From: devel-bounces_at_[hidden] <devel-bounces_at_[hidden]>
>>>>>> To: Open MPI Developers <devel_at_[hidden]>
>>>>>> Sent: Tue Jul 28 16:39:42 2009
>>>>>> Subject: Re: [OMPI devel] libtool issue with crs/self
>>>>>> It was mentioned to me that r21731 might have caused this problem by
>>>>>> restricting the visibility of the libltdl library.
>>>>>> https://svn.open-mpi.org/trac/ompi/changeset/21731
>>>>>> Brian,
>>>>>> Do you have any thoughts on how we might extend the visibility so that
>>>>>> MCA components could also use the libtool in opal?
>>>>>> I can try to initialize libtool in the Self CRS component and use it
>>>>>> directly, but since it is already opened by OPAL, I think it might be
>>>>>> better to use the instantiation in OPAL.
>>>>>> Cheers,
>>>>>> Josh
>>>>>> On Jul 28, 2009, at 3:06 PM, Josh Hursey wrote:
>>>>>>> Once upon a time, the Self CRS module worked correctly, but I admit
>>>>>>> that I have not tested it in a long time.
>>>>>>> The Self CRS component uses lt_dlopen() and friends to inspect the
>>>>>>> running process for a particular set of functions. When I try to run
>>>>>>> an MPI program that contains these signatures I get the following
>>>>>>> error when it tries to resolve lt_dlopen() in
>>>>>>> opal_crs_self_component_query():
>>>>>>> ------------------
>>>>>>> my-app: symbol lookup error: /path/to/install/lib/openmpi/mca_crs_self.so: undefined symbol: lt_dlopen
>>>>>>> ------------------
>>>>>>> I am configuring with the following:
>>>>>>> ------------------
>>>>>>> ./configure --prefix=/path/to/install \
>>>>>>> --enable-binaries \
>>>>>>> --with-devel-headers \
>>>>>>> --enable-debug \
>>>>>>> --enable-mpi-threads \
>>>>>>> --with-ft=cr \
>>>>>>> --without-memory-manager \
>>>>>>> --enable-ft-thread \
>>>>>>> CC=gcc CXX=g++ \
>>>>>>> F77=gfortran FC=gfortran
>>>>>>> ------------------
>>>>>>> The source code is at the link below:
>>>>>>> https://svn.open-mpi.org/trac/ompi/browser/trunk/opal/mca/crs/self
>>>>>>> Does anyone have any thoughts on what might be going wrong here?
>>>>>>> Thanks,
>>>>>>> Josh
>>>>>>> _______________________________________________
>>>>>>> devel mailing list
>>>>>>> devel_at_[hidden]
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel