
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] CentOS 6.3 & OpenMPI 1.6.3
From: Joseph Farran (jfarran_at_[hidden])
Date: 2012-12-02 15:04:09


Hi again.

Had to get some sleep :-)

Same thing. Let me outline the steps I took in case I missed something.

I have a stock CentOS 6.3 with kernel 2.6.32-279.14.1.el6.x86_64

Install OFED-1.5.4.1 as follows:
cd OFED-1.5.4.1
./install.pl --all --print-available
grep -v debuginfo ofed-all.conf > ofed.conf
./install.pl -c ofed.conf

After a while, OFED 1.5.4.1 reports that it installed successfully. I reboot, and commands like ibhosts work.

I now install mxm and fca as follows (using your new mxm):

# rpm -e mxm <--- To make sure.
# cd /tmp
# rpm -i /tmp/mxm/v1.1/per-ofed/1.5.3-3.1.0/mxm-1.1.3a5e745-1.x86_64-rhel6u3.rpm
# rpm -qa | grep mxm
mxm-1.1.3a5e745-1.x86_64

Now I try compiling OpenMPI 1.6.3 with the config:

     CFLAGS="" FCFLAGS="" ./configure \
     --with-sge \
     --with-openib=/usr \
     --enable-openib-connectx-xrc \
     --enable-mpi-thread-multiple \
     --with-threads \
     --with-hwloc \
     --enable-heterogeneous \
     --with-fca=/opt/mellanox/fca \
     --with-mxm-libdir=/opt/mellanox/mxm/lib \
     --with-mxm=/opt/mellanox/mxm \
     --prefix=/data/openmpi-1-6.3

And it fails again with the new mxm (the build compiled against MOFED 1.5.3-3.1.0):

make[2]: Entering directory `/data/apps/sources/openmpi-1.6.3/ompi/mca/mtl/mxm'
   CC mtl_mxm.lo
   CC mtl_mxm_component.lo
   CC mtl_mxm_endpoint.lo
   CC mtl_mxm_recv.lo
   CC mtl_mxm_send.lo
   CCLD mca_mtl_mxm.la
/bin/grep: /usr/local/mofed-inst/1.5.3-3.1.0/lib/librdmacm.la: No such file or directory
/bin/sed: can't read /usr/local/mofed-inst/1.5.3-3.1.0/lib/librdmacm.la: No such file or directory
libtool: link: `/usr/local/mofed-inst/1.5.3-3.1.0/lib/librdmacm.la' is not a valid libtool archive
make[2]: *** [mca_mtl_mxm.la] Error 1
make[2]: Leaving directory `/data/apps/sources/openmpi-1.6.3/ompi/mca/mtl/mxm'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/data/apps/sources/openmpi-1.6.3/ompi'
make: *** [all-recursive] Error 1
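
A possible way to find where libtool picks up that path: the error suggests that some libtool archive (*.la) used in the link names /usr/local/mofed-inst/1.5.3-3.1.0/lib/librdmacm.la as a dependency. The sketch below searches a directory for *.la files that mention a given string. The helper name find_la_refs is made up for illustration, and the assumption that the reference lives under /opt/mellanox/mxm/lib is unconfirmed.

```shell
# Hypothetical helper (not part of Open MPI, OFED, or mxm):
# list libtool archives (*.la) under a directory that mention a given string.
find_la_refs() {
    dir="$1"      # directory to search, e.g. /opt/mellanox/mxm/lib
    needle="$2"   # fixed string to look for, e.g. mofed-inst
    # grep -F: treat the needle as a fixed string; -l: print only file names.
    # With "-exec ... {} +", grep is not run at all when no *.la files exist.
    find "$dir" -name '*.la' -type f -exec grep -lF -- "$needle" {} +
}

# Example call (assumes the mxm libraries are installed in this location):
#   find_la_refs /opt/mellanox/mxm/lib mofed-inst
```

Any file this prints is a candidate source of the stale path. No output would mean the reference comes from somewhere else.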

Note: there is no /usr/local/mofed-inst directory on this machine:

# ls /usr/local
bin etc include lib lib64 libexec sbin share src
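
Since the directory is absent, a related check is to list which *.la dependencies of a given libtool archive are missing on disk. This is only a sketch: missing_la_deps is a hypothetical helper, it assumes the usual single-line dependency_libs='...' entry found in libtool archives, and it does not handle paths containing spaces.

```shell
# Hypothetical helper (not part of libtool): print every *.la entry in a
# libtool archive's dependency_libs line that does not exist on disk.
missing_la_deps() {
    la_file="$1"
    # The line looks like: dependency_libs=' -L/path /path/libfoo.la -lbar'
    deps=$(sed -n "s/^dependency_libs='\(.*\)'\$/\1/p" "$la_file")
    # Rely on word splitting to walk the entries (no spaces in paths assumed).
    for dep in $deps; do
        case "$dep" in
            *.la) [ -e "$dep" ] || printf '%s\n' "$dep" ;;
        esac
    done
}

# Example call (the archive name is an assumption, adjust to what is installed):
#   missing_la_deps /opt/mellanox/mxm/lib/libmxm.la
```

If this prints the librdmacm.la path from the error above, the archive it was run on is the one carrying the reference to the build-time MOFED location.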

One question: when I built OFED 1.5.4.1 above, I skipped the debuginfo packages ( grep -v debuginfo ofed-all.conf > ofed.conf ). I don't think I need them, do I?

Any other suggestions?

On 12/2/2012 2:56 AM, Mike Dubman wrote:
> please redownload from http://mellanox.com/downloads/hpc/mxm/v1.1/mxm-latest.tar
> it contains binaries compiled with mofed 1.5.3-3.1.0
> M