Sorry for reviving this old thread, but I think the solution is not yet
satisfactory.
To summarize the problem: OpenMPI has a plugin architecture. The plugins
rely on the OpenMPI library being loaded into the global namespace, so
that its symbols are accessible to them. If the MPI library is
dynamically loaded into a private namespace (as happens, for example,
when it is used from a Python module), the plugins can't find the
symbols of the library.
So far, the suggested solution is that OpenMPI users should either open
libmpi.so into the global namespace to avoid the problem, or compile
OpenMPI using --enable-shared --enable-static. Both approaches have
their problems, which I detail below.
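
For reference, the first workaround looks roughly like this when done
from Python. This is only a sketch: it assumes libmpi can be found
through the usual library search, and load_global is a name I made up
for illustration, not part of any package.

```python
import ctypes
import ctypes.util


def load_global(name="mpi"):
    """Try to load lib<name> into the global symbol namespace.

    Returns the ctypes handle, or None if the library cannot be
    found or loaded. (Hypothetical helper, for illustration only.)
    """
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        # RTLD_GLOBAL makes the library's symbols visible to objects
        # that are dlopen()ed later, such as the OpenMPI plugins.
        return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None


if __name__ == "__main__":
    handle = load_global("mpi")
    print("libmpi loaded globally" if handle else "libmpi not found")
```

This has to run before the module that uses MPI is imported, which is
exactly the kind of change to the loading program that I discuss below.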
What I do not really understand is why the problem is not solved on the
OpenMPI side. As far as I can see, it has already been discussed here:
and the solution described there looks as though it should still work
now, or shouldn't it? Just link all the OpenMPI plugins against the
base OpenMPI libraries, and it should work. Or am I wrong?
The problems with the suggested solutions:
* Opening libmpi into the global namespace has exactly the problems that
come with loading symbols into the global namespace. After all, there is
some sense in not putting all symbols into the global namespace...
* Furthermore, it requires modifying the program/plugin that loads the
MPI library. In some cases this modification might not be simple, as it
would have to be made in a package that is outside the user's control.
After all, some packages might rather decide to ignore OpenMPI than
adapt their code to it. So I think the best solution would be for
OpenMPI to be as compatible with other MPI implementations as possible.
* On many medium-size clusters, it is not easy for users to install
their own version of MPI, and the admins are often reluctant to install
anything that is not off the shelf. Therefore, if OpenMPI has to be
compiled with non-default flags to make it work with some
plugin-enabled programs, I would guess that this simply won't happen on
many clusters of this type.