Thanks for your responses.
We found the problem: the librdmacm-devel rpm was not installed on the build system.
After installing the rpm and rebuilding OpenMPI, RoCE works fine.
You might want to add the requirement for the librdmacm-devel rpm to the install readme.
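
For anyone who hits the same thing, here is a rough sketch of a pre-build sanity check. It is only an illustration, not part of Open MPI: the function names and the default include directories are assumptions, and the only strings taken from this thread are the header name "rdma/rdma_cma.h" and the configure line "checking if OpenFabrics RDMACM support is enabled...".

```python
import os

# Assumed default search path; adjust for your distribution or a custom OFED prefix.
DEFAULT_INCLUDE_DIRS = ("/usr/include", "/usr/local/include")


def find_header(header, include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the first full path at which `header` exists, or None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, header)
        if os.path.isfile(candidate):
            return candidate
    return None


def rdmacm_enabled(configure_output):
    """Return True/False from the RDMACM line of configure output, or None if the line is absent."""
    marker = "checking if OpenFabrics RDMACM support is enabled..."
    for line in configure_output.splitlines():
        if marker in line:
            # Text after the marker is configure's verdict, e.g. "yes" or "no".
            return line.split(marker, 1)[1].strip() == "yes"
    return None


if __name__ == "__main__":
    path = find_header("rdma/rdma_cma.h")
    if path is None:
        print("rdma/rdma_cma.h not found; is librdmacm-devel installed?")
    else:
        print("found", path)
```

Run the script on the build machine before configure; if the header is missing, installing the librdmacm-devel rpm is the likely fix. The second helper can be pointed at saved configure output to confirm that RDMACM support was actually picked up.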
> -----Original Message-----
> From: Jeff Squyres [mailto:jsquyres_at_[hidden]]
> Sent: Wednesday, October 05, 2011 9:15 AM
> To: kliteyn_at_[hidden]; Open MPI Users
> Cc: Konz, Jeffrey (SSA Solution Centers)
> Subject: Re: [OMPI users] problem running with RoCE over 10GbE
> On Oct 5, 2011, at 9:04 AM, Yevgeny Kliteynik wrote:
> >> Built OpenMPI with this option "--enable-openib-rdmacm".
> >> Our system has OFED 1.5.2 with librdmacm-1.0.13-1
> >> I noticed this output from configure script:
> >> checking rdma/rdma_cma.h usability... no
> >> checking rdma/rdma_cma.h presence... no
> >> checking for rdma/rdma_cma.h... no
> >> checking whether IBV_LINK_LAYER_ETHERNET is declared... yes
> >> checking if RDMAoE support is enabled... yes
> >> checking for infiniband/driver.h... yes
> >> checking if ConnectX XRC support is enabled... yes
> >> checking if dynamic SL is enabled... no
> >> checking if OpenFabrics RDMACM support is enabled... no
> >> Are we missing a build option or a piece of software?
> >> Config.log and output from "ompi_info --all" attached.
> > You shouldn't use the "--enable-openib-rdmacm" option - rdmacm
> > support is enabled by default, providing librdmacm is found on
> > the machine.
> Actually, this might be a configure bug. We have lots of other
> configure options that, even if "foo" support is optional, if you
> specify "--with-foo", then OMPI treats it as mandatory. Specifically,
> if foo can't be found, it's an error and configure should abort (i.e.,
> let a human figure it out).
> Yevgeny -- can you check that out?
> > So the question is, why didn't the OMPI configure script find it?
> > OMPI looks for the "rdma/rdma_cma.h" header. Do you have it on
> > your build machine?
> > The usual location of this file is /usr/include/rdma/rdma_cma.h
> Here's the culprit in config.log:
> configure:118771: checking rdma/rdma_cma.h presence
> configure:118771: gcc -E conftest.c
> conftest.c:573:27: error: rdma/rdma_cma.h: No such file or directory
> configure:118771: $? = 1
> I'd double check that that file is actually present on your system. I
> don't think <> vs. "" will make a difference, though.
> Jeff Squyres