Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] libtool issue with crs/self
From: Josh Hursey (jjhursey_at_[hidden])
Date: 2009-08-05 12:29:25

On Aug 5, 2009, at 11:35 AM, Brian W. Barrett wrote:

> Josh -
> Just in case it wasn't clear -- if you're only looking for a symbol
> in the executable (which you know is there), you do *NOT* have to
> dlopen() the executable first (you do with libtool to support the "i
> don't have dynamic library support" mode of operation). You only
> have to dlsym() with RTLD_DEFAULT, as the symbol is already in the
> process space.

So is it wrong to dlopen() before dlsym()? The patch I just committed
in r21766 does this, since I was following the man page for dlopen()
to make sure I was using it correctly.
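
For reference, the two lookup styles being compared can be sketched as
below. This is only an illustration, not the code committed in r21766:
the helper names lookup_default() and lookup_via_dlopen() are
hypothetical, and it assumes a glibc-style platform where RTLD_DEFAULT
is only declared when _GNU_SOURCE is defined.

```c
#define _GNU_SOURCE /* assumption: glibc, which guards RTLD_DEFAULT behind this */
#include <dlfcn.h>
#include <stddef.h>

/* Style 1 (hypothetical helper): search the global symbol scope of the
 * running process. No dlopen() call is made. Returns NULL if the
 * symbol cannot be found. */
void *lookup_default(const char *name)
{
    return dlsym(RTLD_DEFAULT, name);
}

/* Style 2 (hypothetical helper): get a handle for the main program
 * with dlopen(NULL, ...), search through that handle, then release
 * the handle. Returns NULL if the handle or the symbol is missing. */
void *lookup_via_dlopen(const char *name)
{
    void *handle = dlopen(NULL, RTLD_NOW);
    void *sym;

    if (handle == NULL) {
        return NULL;
    }
    sym = dlsym(handle, name);
    dlclose(handle); /* the main program itself is never unloaded */
    return sym;
}
```

Either helper only finds a function defined in the executable itself if
that function is present in the executable's dynamic symbol table (with
GNU ld, for example, by linking with -rdynamic). On older glibc versions
the program also has to be linked with -ldl.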

> It does probably mean we can't support self on platforms without
> dlsym(), but that set is extremely small and since we don't use
> libtool to link the final executable, the lt_dlsym wrappers wouldn't
> have worked anyway.

Yeah, I put a note to this effect in the commit messages. Not very many
people use the 'self' component at the moment, and I think this patch
will work for what they need. However, if it ever doesn't, then we/I can
come back to revisit this issue.


> Brian
> On Wed, 5 Aug 2009, George Bosilca wrote:
>> Josh,
>> These look like two different issues to me. One is how some modules
>> from Open MPI can use the libltdl, and for this you highlighted the
>> issue. The second is that the users who want to use the self CRS
>> have to make sure the symbols required by self CRS are visible in
>> their application. This is clearly an item for the FAQ.
>> george.
>> On Aug 5, 2009, at 10:51 , Josh Hursey wrote:
>>> As an update on this thread: I had a bit of time this morning to
>>> look into this.
>>> I noticed that the "-fvisibility=hidden" option when passed to
>>> libltdl will cause it to fail in its configure test for:
>>> "checking whether a program can dlopen itself"
>>> This is because the symbol they are trying to look for with
>>> dlsym() is not postfixed with:
>>> __attribute__ ((visibility("default")))
>>> If I do that, then the test passes correctly.
>>> I am not sure if this is a configure bug in Libtool or not. But
>>> what it means is that even with the wrapper around the OPAL
>>> libltdl routines, it is not useful to me since I need to open the
>>> executable to examine it for the necessary symbols.
>>> So I might try to go down the track of using dlopen/dlsym/dlclose
>>> directly instead of through the libtool interfaces. However I just
>>> wanted to mention that this is happening in case there are other
>>> places in the codebase that ever want to look into the executable
>>> for symbols, and find that lt_dlopen() fails in non-obvious ways.
>>> -- Josh
>>> On Jul 29, 2009, at 11:01 AM, Brian W. Barrett wrote:
>>>> Never mind, I'm an idiot. I still don't like the wrappers around
>>>> lt_dlopen in util, but it might be your best option. Are you
>>>> looking for symbols in components or the executable? I assumed
>>>> the executable, in which case you might be better off just using
>>>> dlsym() directly. If you just want the symbol from the first
>>>> place it's found, then you can do:
>>>> dlsym(RTLD_DEFAULT, symbol);
>>>> The lt_dlsym only really helps if you're running on really
>>>> obscure platforms which don't support dlsym and loading
>>>> "preloaded" components.
>>>> Brian
>>>> On Wed, 29 Jul 2009, Brian W. Barrett wrote:
>>>>> What are you trying to do with lt_dlopen? It seems like you
>>>>> should always go through the MCA base utilities. If one's
>>>>> missing, adding it there seems like the right mechanism.
>>>>> Brian
>>>>> On Wed, 29 Jul 2009, Josh Hursey wrote:
>>>>>> George suggested that to me as well yesterday after the
>>>>>> meeting. So we would create opal interfaces to libtool (similar
>>>>>> to what we do with the event engine). That might be the best
>>>>>> way to approach this.
>>>>>> I'll start to take a look at implementing this. Since opal/
>>>>>> libltdl is not part of the repository, is there a 'right' place
>>>>>> to put this header? maybe in opal/util/?
>>>>>> Thanks,
>>>>>> Josh
>>>>>> On Jul 28, 2009, at 6:57 PM, Jeff Squyres (jsquyres) wrote:
>>>>>>> Josh - this is almost certainly what happened. Yoibks.
>>>>>>> Unfortunately, there's good reasons for it. :(
>>>>>>> What about if we proxy calls to lt_dlopen through an opal
>>>>>>> function call?
>>>>>>> -jms
>>>>>>> Sent from my PDA. No type good.
>>>>>>> ----- Original Message -----
>>>>>>> From: devel-bounces_at_[hidden] <devel-bounces_at_[hidden]>
>>>>>>> To: Open MPI Developers <devel_at_[hidden]>
>>>>>>> Sent: Tue Jul 28 16:39:42 2009
>>>>>>> Subject: Re: [OMPI devel] libtool issue with crs/self
>>>>>>> It was mentioned to me that r21731 might have caused this problem
>>>>>>> by restricting the visibility of the libltdl library.
>>>>>>> Brian,
>>>>>>> Do you have any thoughts on how we might extend the visibility so
>>>>>>> that MCA components could also use the libtool in opal?
>>>>>>> I can try to initialize libtool in the Self CRS component and use
>>>>>>> it directly, but since it is already opened by OPAL, I think it
>>>>>>> might be better to use the instantiation in OPAL.
>>>>>>> Cheers,
>>>>>>> Josh
>>>>>>> On Jul 28, 2009, at 3:06 PM, Josh Hursey wrote:
>>>>>>>> Once upon a time, the Self CRS module worked correctly, but I admit
>>>>>>>> that I have not tested it in a long time.
>>>>>>>> The Self CRS component uses dl_open and friends to inspect the
>>>>>>>> running process for a particular set of functions. When I try to
>>>>>>>> run an MPI program that contains these signatures I get the
>>>>>>>> following error when it tries to resolve lt_dlopen() in
>>>>>>>> opal_crs_self_component_query():
>>>>>>>> ------------------
>>>>>>>> my-app: symbol lookup error: /path/to/install/lib/openmpi/
>>>>>>>> undefined symbol: lt_dlopen
>>>>>>>> ------------------
>>>>>>>> I am configuring with the following:
>>>>>>>> ------------------
>>>>>>>> ./configure --prefix=/path/to/install \
>>>>>>>> --enable-binaries \
>>>>>>>> --with-devel-headers \
>>>>>>>> --enable-debug \
>>>>>>>> --enable-mpi-threads \
>>>>>>>> --with-ft=cr \
>>>>>>>> --without-memory-manager \
>>>>>>>> --enable-ft-thread \
>>>>>>>> CC=gcc CXX=g++ \
>>>>>>>> F77=gfortran FC=gfortran
>>>>>>>> ------------------
>>>>>>>> The source code is at the link below:
>>>>>>>> Does anyone have any thoughts on what might be going wrong
>>>>>>>> here?
>>>>>>>> Thanks,
>>>>>>>> Josh
>>>>>>>> _______________________________________________
>>>>>>>> devel mailing list
>>>>>>>> devel_at_[hidden]