Hardware Locality Development Mailing List Archives

Subject: [hwloc-devel] plugins inside plugin broken, as expected
From: Brice Goglin (Brice.Goglin_at_[hidden])
Date: 2013-06-03 04:45:47


Hello,

I recently got the first report of what we knew would happen one day or
another: plugin namespace issues caused by somebody loading a
plugin-enabled hwloc as a plugin. It comes from OpenCL (which uses
plugins to select implementations) because one implementation depends on
hwloc. What happens is that hwloc fails to load its plugins because they
need some functions from the hwloc core, but they cannot find them
because hwloc was loaded in a private namespace within an OpenCL plugin.

What's annoying is that the program seems to load plugins just fine but
later aborts at use-time because of the missing symbol (and there's no
portable/easy way to force load-time lookup from what I see in the ltdl
documentation).

One easy workaround is to set HWLOC_PLUGINS_PATH=/none in the
environment, so that no hwloc plugin is found. But this may remove some
features.
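
For an application that cannot rely on the user's environment, the same
workaround can probably be applied from inside the process. This is only
a sketch: the helper names are made up for illustration, and it assumes
the variable is set before anything (the application itself or an OpenCL
implementation) initializes hwloc for the first time.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: point hwloc at a plugin directory that does not
 * exist, so that no plugin is found. Assumed to be called early, before
 * hwloc gets initialized by anybody in this process.
 * Returns 0 on success, -1 on failure (same as setenv()). */
int disable_hwloc_plugins(void)
{
    /* "/none" is simply a path that is not expected to exist. */
    return setenv("HWLOC_PLUGINS_PATH", "/none", 1);
}

/* Hypothetical helper: returns 1 if the workaround is in effect. */
int hwloc_plugins_disabled(void)
{
    const char *path = getenv("HWLOC_PLUGINS_PATH");
    return path != NULL && strcmp(path, "/none") == 0;
}
```

Calling disable_hwloc_plugins() at the very beginning of main() should
be enough. The drawback is the same as with the environment variable:
features that are only available as plugins are lost.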

The proper fix for now is to rebuild hwloc without plugins. So we don't
have to hurry and fix this for v1.7.2, but we can still look at it for v1.8.

Two solutions were envisioned earlier:
* Have hwloc plugins depend on libhwloc. Jeff didn't like it because it
will cause multiple instances of libhwloc to be loaded, which will break
if we have internal/global state in libhwloc. I think we actually have
no such internal state, but this way may still be dangerous.
* Have the core tell plugins where core symbols are. Basically means
doing our own symbol lookup manually. Possible issues:
  + We have maaaaaaany symbols, it's not easy to define which ones are
available to plugins and which ones are not. Quick look [1].
  + Plugins won't be able to call hwloc functions directly anymore, and
they won't be able to use inline helpers anymore (since those often call
hwloc core functions explicitly).
  + Need to implement that without causing future ABI breaks when
extending the API that is available to plugins. Maybe have plugins pass
an array of strings listing which symbols they need.
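
To make the second solution more concrete, here is a rough sketch of
what "plugins pass an array of strings" could look like. None of these
names exist in hwloc: core_fn, core_exports, core_lookup, core_resolve
and the two core_get_*() stand-ins are all hypothetical. The core keeps
a name/pointer table, and a plugin hands over the list of names it needs
at load time.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer; the plugin casts it back to the real type. */
typedef void (*core_fn)(void);

/* Stand-ins for two core functions a plugin might need (hypothetical). */
static int core_get_depth(void)  { return 3; }
static int core_get_nbobjs(void) { return 8; }

struct core_symbol {
    const char *name;
    core_fn fn;
};

/* What the core agrees to expose to plugins. Appending an entry does
 * not change the layout seen by existing plugins. */
static const struct core_symbol core_exports[] = {
    { "core_get_depth",  (core_fn) core_get_depth  },
    { "core_get_nbobjs", (core_fn) core_get_nbobjs },
};

/* Look up one exported symbol by name, or return NULL. */
core_fn core_lookup(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof(core_exports) / sizeof(core_exports[0]); i++)
        if (strcmp(core_exports[i].name, name) == 0)
            return core_exports[i].fn;
    return NULL;
}

/* Resolve a NULL-terminated list of names into out[].
 * Returns 0 on success. On the first unknown name, returns -1 and
 * stores that name in *missing, so the failure shows up at load time. */
int core_resolve(const char *const *names, core_fn *out, const char **missing)
{
    size_t i;
    for (i = 0; names[i] != NULL; i++) {
        out[i] = core_lookup(names[i]);
        if (out[i] == NULL) {
            *missing = names[i];
            return -1;
        }
    }
    return 0;
}

/* Plugin side: the list of core symbols this plugin needs. */
static const char *const plugin_needs[] = {
    "core_get_depth", "core_get_nbobjs", NULL
};
```

A plugin would call core_resolve(plugin_needs, ...) once when it gets
loaded and refuse to register if that fails, instead of aborting later
at use-time. It does not address the inline helpers issue above, since
those would still reference core functions by name.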

Other ideas?

Brice

[1] Review of public symbols:

Things that shouldn't be available to plugins:
* init/load/destroy
* topology_set_*() topology_ignore_*() topology_restrict()
* XML export/import
* cpubind/membind/last_cpu_location (as well as alloc/free)
* custom_insert_*

Things that should be available:
* hwloc/plugins.h
* other insert() functions (not sure)
* most of our get() functions
* most stringification functions
* minor other things
(about 30 total)

hwloc/bitmap.h is the biggest problem: plugins should be allowed to use
all of its functions, but there are maaaaany of them. Splitting hwloc-bitmap.so
out of hwloc.so would be an easy way to solve this. The bitmap API is
totally independent from the hwloc core anyway.

Brice