It depends on how Open MPI was built.
If Open MPI was built without plugins (i.e., all the plugins are slurped up into libmpi and friends), then yes, applications need to link against librdmacm to use the RDMA CM mode of OpenFabrics transport.
If Open MPI was built with plugins (which is the default), then apps don't need to link against librdmacm because the only use of rdmacm is in an Open MPI plugin, and that plugin was linked against librdmacm.
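One rough way to tell which kind of build you have is to look for the openib plugin as a separate shared object. This is only a sketch: `has_openib_plugin` is a made-up helper, and both the `openmpi/` plugin subdirectory and the `mca_btl_openib.so` file name are assumptions about the install layout. The lib directory is the one from the link line quoted below.

```shell
# Hypothetical helper: succeed if the given Open MPI lib directory holds the
# openib plugin as its own shared object (assumed layout: <libdir>/openmpi/).
has_openib_plugin() {
    libdir="$1"
    [ -f "$libdir/openmpi/mca_btl_openib.so" ]
}

# Lib directory taken from the quoted link line; adjust for your install.
libdir=/usr/local/openmpi/1.3.3/gcc/x86_64/lib64

if has_openib_plugin "$libdir"; then
    echo "plugin build: only the plugin itself should need librdmacm"
    # Show whether the plugin was linked against librdmacm.
    ldd "$libdir/openmpi/mca_btl_openib.so" | grep rdmacm || true
else
    echo "no openib plugin found under $libdir/openmpi"
fi
```

If the plugin is present and `ldd` lists librdmacm for it, applications should not need `-lrdmacm` on their own link line.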
That being said, the output from mpic++ --showme should give you something that is directly compile-/link-able. So it is odd if mpic++ is showing you something that can't (or shouldn't?) be done. Did a -L argument get lost somewhere, perchance?
Does linking MPI applications with mpic++ work properly, or does it result in the same error? If it results in the same error, then perhaps something has changed since Open MPI was installed...?
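To check whether a -L argument went missing, something like the following may help. It is a sketch: `find_lib_dir` is a hypothetical helper that only looks in the -L directories the wrapper prints, not in the linker's default search paths, so a "not found" result here is a hint rather than proof.

```shell
# Hypothetical helper: given a library name and a list of linker flags,
# print the first -L directory containing lib<name>.so or lib<name>.a.
find_lib_dir() {
    name="$1"; shift
    for flag in "$@"; do
        case "$flag" in
            -L*)
                dir="${flag#-L}"
                if [ -e "$dir/lib$name.so" ] || [ -e "$dir/lib$name.a" ]; then
                    printf '%s\n' "$dir"
                    return 0
                fi
                ;;
        esac
    done
    return 1
}

if command -v mpic++ >/dev/null 2>&1; then
    # Print the link flags the wrapper would add.
    mpic++ --showme:link
    # Word splitting of the wrapper output is intentional here.
    find_lib_dir rdmacm $(mpic++ --showme:link) \
        || echo "librdmacm is not in any -L directory the wrapper lists"
fi
```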
All this being said, two other random points:
1. Ensure that you're using the "right" mpic++. I.e., make sure it matches the version/installation of Open MPI that you're trying to use.
2. If you don't link with librdmacm, you're probably not losing any important functionality unless you have an iWARP-based cluster (that's the only transport that *needs* librdmacm). IB-based networks can use librdmacm, but don't *need* it (it's only used for making initial connections, so using librdmacm or not has no implications for overall MPI performance). It's still odd that mpic++ wants it and it can't be found, though...
Does that help?
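If you do decide to link without librdmacm, a build script could filter the flag out rather than editing it by hand. This is just a sketch: `drop_flag` is a made-up helper, and the `MPI_LIBS` value is copied from the link line quoted below purely as an illustration.

```shell
# Hypothetical helper: remove every exact occurrence of one flag
# (first argument) from the remaining arguments and print the rest.
drop_flag() {
    flag="$1"; shift
    out=""
    for f in "$@"; do
        [ "$f" = "$flag" ] || out="${out:+$out }$f"
    done
    printf '%s\n' "$out"
}

# Illustrative value taken from the quoted link line.
MPI_LIBS="-lmpi_cxx -lmpi -lopen-rte -lopen-pal -lrdmacm -libverbs"

# Word splitting of $MPI_LIBS is intentional here.
MPI_LIBS=$(drop_flag -lrdmacm $MPI_LIBS)
echo "$MPI_LIBS"
# prints: -lmpi_cxx -lmpi -lopen-rte -lopen-pal -libverbs
```

This only hides the symptom; it doesn't explain why the wrapper lists a library that can't be found.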
On Dec 15, 2009, at 11:11 PM, tom fogal wrote:
> Simon Su <newsgroup4ssu_at_[hidden]> writes:
> > Hi Tom,
> > I am using the standard openmpi package that run on all the cluster
> > machines here at Princeton. So, maybe I shouldn't touch openmpi. But,
> > removing -lrdmacm from the MPI_LIBS line in the machinename.conf file
> > worked. Any implication from doing this?
> The only thing it could possibly do is disable RDMA for you. However,
> since removing it did not produce any undefined symbol errors, my guess
> is that your OpenMPI isn't using RDMA anyway.
> There might be an OpenMPI bug here, though. I've cc'd the OpenMPI
> community to see if they have any input. As a summary for them: Simon
> is trying to build our MPI-enabled application. A script which tries
> to automate this adds the output of "mpic++ -show". His build then
> failed because it attempted to link against librdmacm, which does not
> exist in his normal search paths (or maybe at all). Is it possible
> that `mpic++ -show' includes/adds "-lrdmacm" even when OpenMPI is not
> itself using the library?
> > On Tue, Dec 15, 2009 at 8:46 PM, tom fogal <tfogal_at_[hidden]> wrote:
> > > Simon Su <newsgroup4ssu_at_[hidden]> writes:
> > > > I am getting this error message while building 9184.
> > > [snip]
> > > > -lz -lm -ldl -lpthread -L/usr/local/openmpi/1.3.3/gcc/x86_64/lib64
> > > > -lmpi_cxx -lmpi -lopen-rte -lopen-pal -lrdmacm -libverbs -lnuma -ldl
> > > > -lnsl
> > > > -lutil -lm -lcognomen \
> > > > -L/usr/local/openmpi/1.3.3/gcc/x86_64/lib64 -lmpi_cxx -lmpi
> > > > -lopen-rte -lopen-pal -lrdmacm -libverbs -lnuma -ldl -lnsl -lutil -lm
> > > > -lcognomen
> > > > /usr/bin/ld: cannot find -lrdmacm
> > > > collect2: ld returned 1 exit status
> > >
> > > Your OpenMPI install (incorrectly?) thinks it has librdmacm available,
> > > but the library isn't in your search path.
> > >
> > > It apparently defaults to enabled in 1.3.3. That seems rather
> > > silly, since I imagine the library requires RDMA hardware, which
> > > is of course not ubiquitous. Anyway, try configuring OpenMPI with
> > > --disable-openib-rdmacm and then rerunning build_visit.
> > >
> > > Of course, if you actually have an RDMA cluster, you'll want to delve
> > > deeper.
> > >
> > > Cheers,
> > >
> > > -tom