I investigated the ibverbs configury issue reported by Paul Hargrove (initial post: http://www.open-mpi.org/community/lists/devel/2014/01/13598.php), and it looks like it's an oshmem configury issue. The short version is that oshmem is doing some configure tests a) at the wrong time, and b) in the wrong place.
Both things are happening in OSHMEM_SETUP_CFLAGS, which is being invoked very, very late in configure.ac:
a) OSHMEM_SETUP_CFLAGS runs after all framework/component setup, during the final *FLAGS (e.g., CFLAGS) processing. By that point, LDFLAGS has been loaded with -export-dynamic, which is intended to be consumed by libtool. But OSHMEM_SETUP_CFLAGS then invokes tests that pass LDFLAGS to plain CC, and badness can occur (e.g., a link test can fail if the compiler doesn't understand that flag).
b) But I'm confused as to the purpose of OSHMEM_SETUP_CFLAGS, anyway:
b1) It's calling OMPI_C_COMPILER_VENDOR([oshmem_c_vendor]). But I can't find where this is used. Am I missing it? If not, it should be removed.
b2) The rest of OSHMEM_SETUP_CFLAGS is all verbs-specific (e.g., it calls OMPI_CHECK_OPENFABRICS). It looks like the flags and #define it sets are used in mca/memheap/base. Two issues:
b2a) Tests that are specific to a framework should be in that framework's configure.m4 (e.g., oshmem/mca/memheap/configure.m4). They should not (effectively) be in the top-level configure.ac.
b2b) Why is all this verbs-specific stuff in the memheap base? It seems like an abstraction violation -- the whole point of components is to have platform-specific code in components, not in the core/base library. Put simply: as a rule of thumb, you shouldn't need to link libibverbs -- or any other network stack library -- in the wrapper compilers (when building libmpi as a shared library with plugins). If you do, it means you have network-specific code in OMPI's core libraries, and you got the abstractions wrong.
From how I'm currently understanding this, it seems like OSHMEM_SETUP_CFLAGS should go away, the tests it is doing should move to a component's configure.m4, and the verbs-specific code in memheap/base should also move to a component.
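To make that concrete, here's roughly what I'd expect a component-level configure.m4 to look like. This is an untested sketch, not a patch: the component name "verbs" and the memheap_verbs_* variable names are made up for illustration, and I'm writing the OMPI_CHECK_OPENFABRICS arguments from memory (variable prefix, action-if-found, action-if-not-found), so check them against the macro's definition. It also assumes the macro fills in <prefix>_CPPFLAGS, <prefix>_LDFLAGS, and <prefix>_LIBS.

```m4
# Untested sketch.  "verbs" is a made-up component name; the real
# component (and exactly which tests it needs) may differ.
#
# MCA_oshmem_memheap_verbs_CONFIG([action-if-can-compile],
#                                 [action-if-cant-compile])
# ------------------------------------------------------------
AC_DEFUN([MCA_oshmem_memheap_verbs_CONFIG],[
    AC_CONFIG_FILES([oshmem/mca/memheap/verbs/Makefile])

    # Same macro that OSHMEM_SETUP_CFLAGS calls today, but now run
    # during component setup instead of during final *FLAGS processing.
    # Argument order is an assumption; see the lead-in.
    OMPI_CHECK_OPENFABRICS([memheap_verbs],
                           [memheap_verbs_happy="yes"],
                           [memheap_verbs_happy="no"])

    # Tell the framework whether this component can be built.
    AS_IF([test "$memheap_verbs_happy" = "yes"], [$1], [$2])

    # Export the flags for this component's Makefile.am only, so they
    # do not end up in the global LDFLAGS or the wrapper compilers.
    AC_SUBST([memheap_verbs_CPPFLAGS])
    AC_SUBST([memheap_verbs_LDFLAGS])
    AC_SUBST([memheap_verbs_LIBS])
])dnl
```

The point of doing it this way is that the verbs tests run at component-setup time, before the libtool-only flags get added to LDFLAGS, and libibverbs only gets linked into the one plugin that needs it.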
Am I misunderstanding this? Can you explain this in more detail?
If it would be helpful, we can discuss this on a webex next week, or somesuch.