Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Open MPI 1.2.4 verbosity w.r.t. osc pt2pt
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-12-10 12:13:59


On Oct 16, 2007, at 11:20 AM, Brian Granger wrote:

> Wow, that is quite a study of the different options. I will spend
> some time looking over things to better understand the (complex)
> situation. I will also talk with Lisandro Dalcin about what he thinks
> the best approach is for mpi4py.

Brian / Lisandro --

I don't think that I heard back from you on this issue. Would you
have major heartburn if I removed all linking of our components
against libmpi (etc.)?

(for a nicely-formatted refresher of the issues, check out https://svn.open-mpi.org/trac/ompi/wiki/Linkers)

Thanks.

> One question though. You said that
> nothing had changed in this respect from 1.2.3 to 1.2.4, but 1.2.3
> doesn't show the problem. Does this make sense?
>
> Brian
>
> On 10/16/07, Jeff Squyres <jsquyres_at_[hidden]> wrote:
>> On Oct 12, 2007, at 3:5 PM, Brian Granger wrote:
>>>> My guess is that Rmpi is dynamically loading libmpi.so, but not
>>>> specifying the RTLD_GLOBAL flag. This means that libmpi.so is not
>>>> available to the components the way it should be, and all goes
>>>> downhill from there. It only mostly works because we do something
>>>> silly with how we link most of our components, and Linux is just
>>>> smart enough to cover our rears (thankfully).
>>>
>>> In mpi4py, libmpi.so is linked in at compile time, not loaded using
>>> dlopen. Granted, the resulting mpi4py binary is loaded into python
>>> using dlopen.
>>
>> I believe that means that libmpi.so will be loaded as an indirect
>> dependency of mpi4py. See the table below.
>>
>>>> The pt2pt component (rightly) does not have a -lmpi in its link
>>>> line. The other components that use symbols in libmpi.so (wrongly)
>>>> do have a -lmpi in their link line. This can cause some problems on
>>>> some platforms (Linux tends to do dynamic linking / dynamic loading
>>>> better than most). That's why only the pt2pt component fails.
>>>
>>> Did this change from 1.2.3 to 1.2.4?
>>
>> No:
>>
>> % diff openmpi-1.2.3/ompi/mca/osc/pt2pt/Makefile.am openmpi-1.2.4/ompi/mca/osc/pt2pt/Makefile.am
>> %
>>
>>>> Solutions:
>>>>
>>>> - Someone could make the pt2pt osc component link in libmpi.so
>>>> like the rest of the components and hope that no one ever
>>>> tries this on a non-friendly platform.
>>>
>>> Shouldn't the openmpi build system be able to figure this stuff out
>>> on a per platform basis?
>>
>> I believe that this would not be useful -- see the tables and
>> conclusions below.
>>
>>>> - Debian (and all Rmpi users) could configure Open MPI with the
>>>> --disable-dlopen flag and ignore the problem.
>>>
>>> Are there disadvantages to this approach?
>>
>> You won't be able to add more OMPI components to your existing
>> installation (e.g., 3rd party components). But that's probably ok,
>> at least for now -- not many people are distributing 3rd party OMPI
>> components.
>>
>>>> - Someone could fix Rmpi to dlopen libmpi.so with the RTLD_GLOBAL
>>>> flag and fix the problem properly.
>>>
>>> Again, my main problem with this solution is that it means I must
>>> both link to libmpi at compile time and load it dynamically using
>>> dlopen. This doesn't seem right. Also, it makes it impossible on
>>> OS X to avoid setting LD_LIBRARY_PATH (OS X doesn't have rpath).
>>> Being able to use openmpi without setting LD_LIBRARY_PATH is
>>> important.
>>
>> This is a very complex issue. Here are the possibilities that I
>> see... (prepare for confusion!)
>>
>> ========================================================================
>>
>> This first table represents what happens in the following scenarios:
>>
>> - compile an application against Open MPI's libmpi, or
>> - compile an "application" DSO that is dlopen'ed with RTLD_GLOBAL, or
>> - explicitly dlopen Open MPI's libmpi with RTLD_GLOBAL
>>
>>                  libmpi       OMPI DSO     OMPI DSO
>>     App linked   includes     components   components depend
>>     against      components?  available?   on libmpi.so?       Result
>>     -----------  -----------  -----------  ------------------  ----------
>>  1. libmpi.so    no           no           NA                  won't run
>>  2. libmpi.so    no           yes          no                  yes
>>  3. libmpi.so    no           yes          yes                 yes (*1*)
>>  4. libmpi.so    yes          no           NA                  yes
>>  5. libmpi.so    yes          yes          no                  maybe (*2*)
>>  6. libmpi.so    yes          yes          yes                 maybe (*3*)
>>     -----------  -----------  -----------  ------------------  ----------
>>  7. libmpi.a     no           no           NA                  won't run
>>  8. libmpi.a     no           yes          no                  yes (*4*)
>>  9. libmpi.a     no           yes          yes                 no (*5*)
>> 10. libmpi.a     yes          no           NA                  yes
>> 11. libmpi.a     yes          yes          no                  maybe (*6*)
>> 12. libmpi.a     yes          yes          yes                 no (*7*)
>>     -----------  -----------  -----------  ------------------  ----------
>>
>> All libmpi.a scenarios assume that libmpi.so is also available.
>>
>> In the OMPI v1.2 series, most components link against libmpi.so, but
>> some do not (it's our mistake for not being uniform).
>>
>> (*1*) As far as we know, this works on all platforms that have
>> dlopen (i.e., almost everywhere). But we've only tested (recently)
>> Linux, OSX, and Solaris. These 3 dynamic loaders are smart enough to
>> realize that they only need to load libmpi.so once (i.e., that the
>> implicit dependency of libmpi.so brought in by the components is the
>> same libmpi.so that is already loaded), so everything works fine.
>>
>> (*2*) If the *same* component is both in libmpi and available as a
>> DSO, the same symbols will be defined twice when the component is
>> dlopen'ed and Badness will ensue. If the components are different,
>> all platforms should be ok.
>>
>> (*3*) Same caveat as (*2*) about a component being both in libmpi
>> and available as a DSO. Same as (*1*) for whether libmpi.so is
>> loaded multiple times by the dynamic loader or not.
>>
>> (*4*) Only works if the application was compiled with the equivalent
>> of the GNU linker's --whole-archive flag.
>>
>> (*5*) This does not work because libmpi.a will be loaded and
>> libmpi.so will also be pulled in as a dependency of the components.
>> As such, all the data structures in libmpi will [attempt to] be in
>> the process twice: the "main libmpi" will have one set and the
>> libmpi pulled in by the component dependencies will have a different
>> set. Nothing good will come of that: possibly dynamic linker
>> run-time symbol conflicts or possibly two separate copies of the
>> symbols. Both possibilities are Bad.
>>
>> (*6*) Same caveat as (*2*) about a component being both in libmpi
>> and available as a DSO.
>>
>> (*7*) Same problem as (*5*).
>>
>> ========================================================================
>>
>> This second table represents what happens in the following scenarios:
>>
>> - compile an "application" DSO that is dlopen'ed with RTLD_LOCAL, or
>> - explicitly dlopen Open MPI's libmpi with RTLD_LOCAL
>>
>>     App DSO      libmpi       OMPI DSO     OMPI DSO
>>     linked       includes     components   components depend
>>     against      components?  available?   on libmpi.so?       Result
>>     -----------  -----------  -----------  ------------------  ----------
>> 13. libmpi.so    no           no           NA                  won't run
>> 14. libmpi.so    no           yes          no                  no (*8*)
>> 15. libmpi.so    no           yes          yes                 maybe (*9*)
>> 16. libmpi.so    yes          no           NA                  ok
>> 17. libmpi.so    yes          yes          no                  no (*10*)
>> 18. libmpi.so    yes          yes          yes                 maybe (*11*)
>>     -----------  -----------  -----------  ------------------  ----------
>> 19. libmpi.a     no           no           NA                  won't run
>> 20. libmpi.a     no           yes          no                  no (*12*)
>> 21. libmpi.a     no           yes          yes                 no (*13*)
>> 22. libmpi.a     yes          no           NA                  ok
>> 23. libmpi.a     yes          yes          no                  no (*14*)
>> 24. libmpi.a     yes          yes          yes                 no (*15*)
>>     -----------  -----------  -----------  ------------------  ----------
>>
>> All libmpi.a scenarios assume that libmpi.so is also available.
>>
>> (*8*) This does not work because the OMPI DSOs require symbols in
>> libmpi that cannot be resolved, because libmpi.so was not loaded in
>> the global scope.
>>
>> (*9*) This is a fun case: the Linux dynamic linker is smart enough to
>> make it work, but others likely will not. What happens is that
>> libmpi.so is loaded in a LOCAL scope, but then OMPI dlopens its own
>> DSOs that require symbols from libmpi. The Linux linker figures this
>> out and resolves the required symbols from the already-loaded LOCAL
>> libmpi.so. Other linkers will fail to figure out that there is a
>> libmpi.so already loaded in the process and will therefore load a 2nd
>> copy. This results in the problems cited in (*5*).
>>
>> (*10*) This does not work either a) because of the caveat stated in
>> (*2*) or b) because of the unresolved symbol issue stated in (*8*).
>>
>> (*11*) This may not work either because of the caveat stated in
>> (*2*) or because of the duplicate libmpi.so issue cited in (*9*).
>> If you are using the Linux linker, then (*9*) is not an issue, and
>> it should work.
>>
>> (*12*) Essentially the same as the unresolved symbol issue cited in
>> (*8*), but with libmpi.a instead of libmpi.so.
>>
>> (*13*) Worse than (*9*): the Linux linker will not figure this one
>> out because the libmpi symbols are not in a library named "libmpi"
>> -- they are simply part of the application DSO, and therefore
>> there's no way for the linker to know that loading libmpi.so will
>> bring in a 2nd set of the same symbols that are already in the
>> process. Hence, we devolve down to the duplicate symbol issue cited
>> in (*5*).
>>
>> (*14*) This does not work either a) because of the caveat stated in
>> (*2*) or b) because of the unresolved symbol issue stated in (*8*).
>>
>> (*15*) This may not work either because of the caveat stated in
>> (*2*) or because of the duplicate libmpi.so issue cited in (*13*).
>>
>> ========================================================================
>>
>> (I'm going to put this data on the OMPI web site somewhere because it
>> took me all day yesterday to get it straight in my head and type it
>> out :-) )
>>
>> In the OMPI v1.2 series, most OMPI configurations fall into scenarios
>> 2 and 3 (as I mentioned above, we have some components that link
>> against libmpi and others that don't -- our mistake for not being
>> consistent).
>>
>> The problematic scenario that the R and Python MPI plugins are
>> running into is 14 because the osc_pt2pt component does *not* link
>> against libmpi. Most of the rest of our components do link against
>> libmpi, and therefore fall into scenario 15, and therefore work on
>> Linux (but possibly not elsewhere).
>>
>> With all this being said, if you are looking for a general solution
>> for the Python and R plugins, dlopen() of libmpi with RTLD_GLOBAL
>> before MPI_INIT seems to be the way to go. Specifically, even if we
>> updated osc_pt2pt to link against libmpi, that would work on Linux,
>> but not elsewhere. dlopen'ing libmpi with GLOBAL seems to be the
>> most portable solution.
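>>
>> For example, a minimal Python-side sketch of that workaround (the
>> exact library filename and the mpi4py import are illustrative
>> assumptions; adjust for your platform and wrapper):
>>
>>   import ctypes
>>   # Pre-load Open MPI's core library into the global symbol
>>   # namespace so that the components Open MPI later dlopen's can
>>   # resolve their libmpi symbols.
>>   ctypes.CDLL("libmpi.so", mode=ctypes.RTLD_GLOBAL)
>>   # Only then import the MPI wrapper (which calls MPI_Init).
>>   from mpi4py import MPI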
>>
>> Indeed, table 1 also suggests that we should change our components
>> (as Brian suggests) to all *not* link against libmpi, because then
>> we'll gain the ability to work properly with a static libmpi.a,
>> putting OMPI's common usage into scenarios 2 and 8 (better than the
>> mix of scenarios 2, 3, 8, and 9 that we have today, which means we
>> don't currently work with libmpi.a).
>>
>> ...but I think that this would break the current R and Python
>> plugins until they put in the explicit call to dlopen().
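>>
>> A related sketch for the Python side (an assumption on my part, not
>> something OMPI provides): instead of dlopen'ing libmpi directly, the
>> plugin can ask the Python interpreter to use RTLD_GLOBAL when it
>> dlopen's the extension module, so that libmpi.so -- an indirect
>> dependency of the extension -- lands in the global scope (at least
>> with the glibc loader). The constant names below are from modern
>> Pythons; older versions keep them in the DLFCN module:
>>
>>   import sys, os
>>   # Make Python dlopen extension modules with RTLD_GLOBAL so that
>>   # libmpi.so, pulled in as a dependency of the extension, ends up
>>   # in the global symbol scope.
>>   sys.setdlopenflags(os.RTLD_NOW | os.RTLD_GLOBAL)
>>   from mpi4py import MPI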
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>

-- 
Jeff Squyres
Cisco Systems