Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Missing Symbol
From: George Bosilca (bosilca_at_[hidden])
Date: 2010-03-05 14:41:53


I guess it is because the symbol is defined by another module that is loaded dynamically at runtime. Since libtool does not load symbols into the global scope, mca_pml_v will not be visible to other modules trying to use it.
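
To illustrate the scoping issue, here is a minimal sketch that uses plain dlopen() rather than libltdl. The helper names plugin_dlopen_flags and plugin_open are hypothetical and are not part of Open MPI. The only point is that a library opened with RTLD_LOCAL keeps its symbols out of the global scope, while RTLD_GLOBAL makes them available to libraries loaded afterwards.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: choose the dlopen() flags for a plugin.
 * RTLD_GLOBAL exposes the plugin's symbols to libraries loaded later;
 * RTLD_LOCAL keeps them private to the returned handle. */
int plugin_dlopen_flags(bool export_symbols)
{
    int flags = RTLD_NOW;

    flags |= export_symbols ? RTLD_GLOBAL : RTLD_LOCAL;
    return flags;
}

/* Hypothetical helper: open a plugin with the chosen scope.
 * Returns NULL on failure; dlerror() then holds the reason. */
void *plugin_open(const char *path, bool export_symbols)
{
    assert(path != NULL);
    return dlopen(path, plugin_dlopen_flags(export_symbols));
}
```

Under these assumptions, the library that defines mca_pml_v would have to be opened with export_symbols set to true before a module that references the symbol is loaded.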

  george.

On Mar 5, 2010, at 14:35 , Leonardo Fialho wrote:

> No George, this trick does not change the problem. I'm looking for the problem in the mca_pml_v declaration, but I still can't figure out the reason why it doesn't work.
>
> Leonardo
>
> On Mar 5, 2010, at 8:12 PM, George Bosilca wrote:
>
>> I would first try the Open MPI configure option --disable-visibility. If this doesn't fix it, you should make sure that dlopen is called with the GLOBAL flag set (I don't remember where exactly in the code, and unfortunately I can't check right now). Use gdb to set a breakpoint on dlopen and you will find it.
>>
>> george.
>>
>> On Mar 5, 2010, at 14:00 , Leonardo Fialho wrote:
>>
>>> Yeah, probably ompi_request_null and opal_output are not good candidates. I'm trying with mca_pml_v. But I'm not familiar with this framework, although it is really small.
>>>
>>> George, you said to change this (opal/mca/base/mca_base_component_find.c):
>>>
>>> #if OPAL_HAVE_LTDL_ADVISE
>>> component_handle = lt_dlopenadvise(target_file->filename, opal_mca_dladvise);
>>> #else
>>> component_handle = lt_dlopenext(target_file->filename);
>>> #endif
>>>
>>> to use lt_dladvise_global instead of lt_dladvise_local?
>>>
>>> Leonardo
>>>
>>> On Mar 5, 2010, at 7:47 PM, Terry Dontje wrote:
>>>
>>>> I would also start nm'ing the .so's you think the U symbols are resolved in to make sure they are exposed. Luckily you only have 3 symbols to look for.
>>>>
>>>> --td
>>>>
>>>> Ralph Castain wrote:
>>>>> It's probably a visibility issue - check for an OMPI_DECLSPEC missing from the declaration of a symbol.
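
As a rough illustration of the visibility point, the sketch below uses GCC-style visibility attributes. EXAMPLE_DECLSPEC, EXAMPLE_HIDDEN and the example_* names are hypothetical stand-ins. The assumption is that OMPI_DECLSPEC plays a role similar to EXAMPLE_DECLSPEC when the library is built with hidden visibility by default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macros, assumed to mirror what an export macro does. */
#if defined(__GNUC__)
#  define EXAMPLE_DECLSPEC __attribute__((visibility("default")))
#  define EXAMPLE_HIDDEN   __attribute__((visibility("hidden")))
#else
#  define EXAMPLE_DECLSPEC
#  define EXAMPLE_HIDDEN
#endif

typedef struct {
    const char *name;
    int version;
} example_component_t;

/* Exported data symbol: other shared objects may reference it. */
EXAMPLE_DECLSPEC example_component_t example_component = { "example", 1 };

/* Hidden helper: usable inside this library only. */
EXAMPLE_HIDDEN int example_internal_version(const example_component_t *c)
{
    assert(c != NULL);
    return c->version;
}

/* Exported entry point that wraps the hidden helper. */
EXAMPLE_DECLSPEC int example_component_version(void)
{
    return example_internal_version(&example_component);
}
```

If this is built as a shared object with -fvisibility=hidden, nm -D on the result should list example_component and example_component_version but not example_internal_version. A symbol that lacks the export macro would be missing in the same way.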
>>>>>
>>>>> On Mar 5, 2010, at 11:40 AM, Leonardo Fialho wrote:
>>>>>
>>>>>
>>>>>> Yes,
>>>>>>
>>>>>> I renamed all references to Aurelien's component name and removed all code belonging to the component itself. There are only functions which return OMPI_SUCCESS. No other function is called.
>>>>>>
>>>>>> I'm debugging with LD_DEBUG=symbols, but the output is really huge! Probably the error is in the mca_pml_v symbol:
>>>>>>
>>>>>> 19643: /home/lfialho/lib/openmpi/mca_vprotocol_receiver.so: error: symbol lookup error: undefined symbol: mca_pml_v (fatal)
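
Where the LD_DEBUG output is too large to read, a small probe that calls dlopen() directly and prints dlerror() can report just the failing lookup. The following is only a sketch: probe_component is a hypothetical helper, not an Open MPI function, and it loads the file in isolation, so a symbol that another component would normally provide will also be reported as undefined here.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to load `path` and copy the dynamic loader's
 * error text into `msg`. Returns 0 on success and -1 on failure.
 * RTLD_NOW is used so unresolved function references fail at load time. */
int probe_component(const char *path, char *msg, size_t msg_len)
{
    void *handle;
    const char *err;

    assert(path != NULL && msg != NULL && msg_len > 0);
    msg[0] = '\0';

    dlerror();                       /* clear any stale error state */
    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle != NULL) {
        dlclose(handle);
        return 0;
    }

    err = dlerror();
    snprintf(msg, msg_len, "%s", err != NULL ? err : "unknown dlopen failure");
    return -1;
}
```

A caller would pass the full path of the component's .so file and print msg when the return value is -1.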
>>>>>>
>>>>>> Leonardo
>>>>>>
>>>>>> On Mar 5, 2010, at 7:35 PM, Ralph Castain wrote:
>>>>>>
>>>>>>
>>>>>>> You said this component was a copy of Aurelien's component? Did you rename the critical elements (e.g., component, module) inside it to avoid name confusion?
>>>>>>>
>>>>>>> On Mar 5, 2010, at 11:27 AM, Leonardo Fialho wrote:
>>>>>>>
>>>>>>>
>>>>>>>> I see... but it is really strange because this module is clean; it does not use anything. This is the output of the nm command; I can't see any symbol that is not available.
>>>>>>>>
>>>>>>>> [lfialho_at_aoclsb-clus openmpi]$ nm mca_vprotocol_receiver.so
>>>>>>>> 0000000000201208 a _DYNAMIC
>>>>>>>> 0000000000201408 a _GLOBAL_OFFSET_TABLE_
>>>>>>>> w _Jv_RegisterClasses
>>>>>>>> 00000000002011e0 d __CTOR_END__
>>>>>>>> 00000000002011d8 d __CTOR_LIST__
>>>>>>>> 00000000002011f0 d __DTOR_END__
>>>>>>>> 00000000002011e8 d __DTOR_LIST__
>>>>>>>> 00000000000011d0 r __FRAME_END__
>>>>>>>> 00000000002011f8 d __JCR_END__
>>>>>>>> 00000000002011f8 d __JCR_LIST__
>>>>>>>> 0000000000201640 A __bss_start
>>>>>>>> w __cxa_finalize@@GLIBC_2.2.5
>>>>>>>> 0000000000000d40 t __do_global_ctors_aux
>>>>>>>> 00000000000007c0 t __do_global_dtors_aux
>>>>>>>> 0000000000201200 d __dso_handle
>>>>>>>> w __gmon_start__
>>>>>>>> 0000000000201640 A _edata
>>>>>>>> 0000000000201648 A _end
>>>>>>>> 0000000000000d78 T _fini
>>>>>>>> 0000000000000750 T _init
>>>>>>>> 00000000000007a0 t call_gmon_start
>>>>>>>> 0000000000201640 b completed.6115
>>>>>>>> 0000000000000810 t frame_dummy
>>>>>>>> U mca_pml_v
>>>>>>>> 0000000000201460 D mca_vprotocol_receiver
>>>>>>>> 0000000000000c71 t mca_vprotocol_receiver_add_comm
>>>>>>>> 0000000000000a5f t mca_vprotocol_receiver_add_procs
>>>>>>>> 0000000000201540 D mca_vprotocol_receiver_component
>>>>>>>> 0000000000000cc3 t mca_vprotocol_receiver_component_close
>>>>>>>> 0000000000000d18 t mca_vprotocol_receiver_component_finalize
>>>>>>>> 0000000000000cce t mca_vprotocol_receiver_component_init
>>>>>>>> 0000000000000cb8 t mca_vprotocol_receiver_component_open
>>>>>>>> 0000000000000c93 t mca_vprotocol_receiver_del_comm
>>>>>>>> 0000000000000a89 t mca_vprotocol_receiver_del_procs
>>>>>>>> 000000000000083c t mca_vprotocol_receiver_dump
>>>>>>>> 0000000000000d23 t mca_vprotocol_receiver_enable
>>>>>>>> 00000000000009e7 t mca_vprotocol_receiver_iprobe
>>>>>>>> 0000000000000b9a t mca_vprotocol_receiver_irecv
>>>>>>>> 0000000000000ab3 t mca_vprotocol_receiver_isend
>>>>>>>> 0000000000000a29 t mca_vprotocol_receiver_probe
>>>>>>>> 0000000000000c00 t mca_vprotocol_receiver_recv
>>>>>>>> 0000000000000b21 t mca_vprotocol_receiver_send
>>>>>>>> 00000000000009bd T mca_vprotocol_receiver_start
>>>>>>>> 0000000000000864 t mca_vprotocol_receiver_test
>>>>>>>> 0000000000000896 t mca_vprotocol_receiver_test_all
>>>>>>>> 00000000000008d0 t mca_vprotocol_receiver_test_any
>>>>>>>> 0000000000000950 t mca_vprotocol_receiver_test_some
>>>>>>>> 0000000000000916 t mca_vprotocol_receiver_wait_any
>>>>>>>> 000000000000098a t mca_vprotocol_receiver_wait_some
>>>>>>>> U ompi_request_null
>>>>>>>> U opal_output
>>>>>>>> 0000000000201440 d p.6113
>>>>>>>> [lfialho_at_aoclsb-clus openmpi]$
>>>>>>>>
>>>>>>>> On Mar 5, 2010, at 7:00 PM, Terry Dontje wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> Sorry, I meant to add this: you might be able to find the symbol causing the issue by experimenting with LD_DEBUG.
>>>>>>>>>
>>>>>>>>> --td
>>>>>>>>> Terry Dontje wrote:
>>>>>>>>>
>>>>>>>>>> Possibly there is an external symbol in the .so that is being loaded that cannot be resolved.
>>>>>>>>>> --td
>>>>>>>>>> Leonardo Fialho wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I know that libtool does not help us find the source of this error, but what can generate the following error?
>>>>>>>>>>>
>>>>>>>>>>> [aoclsb-clus.uab.es:11724] mca: base: component_find: unable to open /home/lfialho/lib/openmpi/mca_vprotocol_receiver: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
>>>>>>>>>>>
>>>>>>>>>>> 1) yes, the file exists
>>>>>>>>>>> 2) yes, it has been compiled along with all the other components
>>>>>>>>>>> 3) yes, it is the same Open MPI version
>>>>>>>>>>> 4) this component is a copy of the pessimist component implemented by Aurelien
>>>>>>>>>>> 5) Aurelien's component presents the same error
>>>>>>>>>>>
>>>>>>>>>>> The question is: what mistake could generate an error during module loading?
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>> Leonardo
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> devel mailing list
>>>>>>>>>>> devel_at_[hidden]
>>>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>>>>>
>>>>>>>>>>>