I've tried launching the application on nodes with QDR InfiniBand. The first attempt, with 2 processes, worked, but the following was printed to the output:
Hello!

I've built Open MPI 1.6.1rc3 with MXM support, but when I try to launch an application using this MTL, it hangs and I can't figure out why.

If I launch it with np below 128, everything works fine, since MXM isn't used. I've tried setting the threshold to 0 and launching 2 processes, with the same result: it hangs on startup.

What could be causing this problem?

Here is the command I execute:

    /opt/openmpi/1.6.1/mxm-test/bin/mpirun \
        -np $NP \
        -hostfile hosts_fdr2 \
        --mca mtl mxm \
        --mca btl ^tcp \
        --mca mtl_mxm_np 0 \
        -x OMP_NUM_THREADS=$NT \
        -x LD_LIBRARY_PATH \
        --bind-to-core \
        -npernode 16 \
        --mca coll_fca_np 0 -mca coll_fca_enable 0 \
        ./IMB-MPI1 -npmin $NP Allreduce Reduce Barrier Bcast Allgather Allgatherv

I'm running the tests on nodes with Intel SB processors and FDR. Open MPI was configured with the following parameters:

    CC=icc CXX=icpc F77=ifort FC=ifort ./configure --prefix=/opt/openmpi/1.6.1rc3/mxm-test --with-mxm=/opt/mellanox/mxm --with-fca=/opt/mellanox/fca --with-knem=/usr/share/knem

I'm using the latest OFED from Mellanox, 1.5.3-3.1.0, on CentOS 6.1 with the default kernel, 2.6.32-131.0.15. Compilation with the default MXM (1.0.601) failed, so I installed the latest version from Mellanox: 1.1.1227.

Best regards,
Pavel Mezentsev.