Thanks Jeff, that clears everything up.
Now I remember: some time ago I ran into an issue like this when using
dynamic loading (dlopen, dlsym, etc.), and I later had to use shared
libraries instead. I think this is the same problem.
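For anyone finding this thread later, here is a minimal sketch of the scope
difference Jeff describes below, written in Python with ctypes (which calls
dlopen() on Linux). The helper names scope_mode and load_library are made up
for illustration, and whether loading libParallel.so with a global scope would
have fixed my case is only an assumption. The fix confirmed in this thread is
rebuilding Open MPI with --disable-dlopen.

```python
import ctypes


def scope_mode(make_global):
    """Pick the dlopen() scope flag: global or local (private)."""
    return ctypes.RTLD_GLOBAL if make_global else ctypes.RTLD_LOCAL


def load_library(path, make_global=True):
    """Load a shared library through ctypes.

    With make_global=True the library's public symbols are made available
    for resolving symbols in libraries that are loaded afterwards (for
    example, plugins opened later by the library itself). With
    make_global=False they stay private to this load.
    """
    return ctypes.CDLL(path, mode=scope_mode(make_global))
```

A call such as load_library("libParallel.so") would request the global scope;
it raises OSError if the library is not on the loader's search path.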
Jeff Squyres wrote:
> On Sep 10, 2009, at 9:42 PM, Ashika Umanga Umagiliya wrote:
>> That fixed the problem!
>> You are indeed a voodoo master... Could you explain the spell behind
>> your magic? :)
> The problem has to do with how plugins (aka dynamic shared objects,
> DSO's) are loaded. When a DSO is loaded into a Linux process, it has
> the option of making all the public symbols in that DSO public to the
> rest of the process or private within its own scope.
> Let's back up. Remember that Open MPI is based on plugins (DSO's).
> It loads lots and lots of plugins during execution (mostly during
> MPI_INIT). These plugins call functions in OMPI's public libraries
> (e.g., they call functions in libmpi.so). Hence, when the plugin
> DSO's are loaded, they need to be able to resolve these symbols into
> actual code that can be invoked. If the symbols cannot be resolved,
> the DSO load fails.
> If libParallel.so is loaded into a private scope, then its linked
> libraries (e.g., libmpi.so) are also loaded into that same private
> scope. Hence, all of libmpi.so's public symbols are only public
> within that single, private scope. Then, when OMPI goes to load its
> own DSOs, since libmpi.so's public symbols are in a private scope,
> OMPI's DSO's can't find them -- and therefore they refuse to load.
> (private scopes are not inherited -- a new DSO load cannot "see"
> libParallel.so/libmpi.so's private scope).
> It's an educated guess from your description that this is what was
> happening.
> OMPI's --disable-dlopen configure option has Open MPI build in a
> different way. Instead of building all of OMPI's plugins as DSOs,
> they are "slurped" up into libmpi.so (etc.). So there's no "loading"
> of DSOs at MPI_INIT time -- the plugin code actually resides *in*
> libmpi.so itself. Hence, resolution of all symbols is done when
> libParallel.so loads libmpi.so. Additionally, there's no secondary
> private scope created when DSOs are loaded -- they're all
> self-contained within libmpi.so (etc.). And therefore all the
> libmpi.so symbols that are required for the plugins are all able to be
> found/resolved at load time.
> Does that make sense?
>> Jeff Squyres wrote:
>> > I'm guessing that this has to do with deep, dark voodoo involved with
>> > the run time linker.
>> > Can you try configuring/building Open MPI with --disable-dlopen
>> > configure option, and rebuilding your libParallel.so against the new
>> > libmpi.so?
>> > See if that fixes the problem for you. If it does, I can explain in
>> > more detail (if you care).
>> > On Sep 10, 2009, at 3:24 AM, Ashika Umanga Umagiliya wrote:
>> >> Greetings all,
>> >> My parallel application is built as a shared library
>> >> (I use Debian Lenny 64bit).
>> >> A webservice is used to dynamically load libParallel.so and in turn
>> >> execute the parallel process.
>> >> But at runtime I get the error:
>> >> webservicestub: symbol lookup error:
>> >> /usr/local/lib/openmpi/mca_paffinity_linux.so: undefined symbol:
>> >> mca_base_param_reg_int
>> >> which I cannot figure out. I checked everything with 'ldd' and 'nm',
>> >> and everything seems fine.
>> >> So I compiled and tested my parallel code as an executable, and then
>> >> it worked fine.
>> >> What could be the reason for this?
>> >> Thanks in advance,
>> >> umanga
>> >> _______________________________________________
>> >> users mailing list
>> >> users_at_[hidden]
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users