I contacted Mellanox, and there is a problem with the version 1.1.3a5e745 rpm.

 

Download the latest version 1.1.ad085ef from

  http://mellanox.com/downloads/hpc/mxm/v1.1/mxm-latest.tar

It builds fine with openmpi-1.6.3.
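
To see whether a node still carries the problematic build, the installed package name can be compared against the version mentioned above. This is only a minimal sketch: the function name mxm_is_bad_build is made up for illustration, and it assumes the package string has the form printed by rpm (for example mxm-1.1.3a5e745-1.x86_64).

```shell
#!/bin/sh
# Hypothetical helper (not part of mxm or Open MPI): succeeds (exit 0)
# if the given package string names the mxm build reported as
# problematic in this thread, and fails (exit 1) otherwise.
mxm_is_bad_build() {
    case "$1" in
        mxm-1.1.3a5e745-*) return 0 ;;
        *)                 return 1 ;;
    esac
}

# Usage on a node:
#   if mxm_is_bad_build "$(rpm -q mxm)"; then echo "upgrade mxm"; fi
```

The check only matches the one build named here; it says nothing about other versions.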

 

-Jeff

 

 

/**********************************************************/
/* Jeff Konz                          jeffrey.konz@hp.com */
/* Solutions Architect                   HPC Benchmarking */
/* Americas Shared Solutions Architecture (SSA)           */
/* Hewlett-Packard Company                                */
/* Office: 248-491-7480              Mobile: 248-345-6857 */
/**********************************************************/
 

From: Joseph Farran [mailto:jfarran@uci.edu]
Sent: Sunday, December 02, 2012 3:39 AM
To: Mike Dubman
Cc: Open MPI Users; Konz, Jeffrey (SSA Solution Centers)
Subject: Re: [OMPI users] OpenMPI-1.6.3 & MXM

 

Hi again.

I believe I have the latest mxm:

# rpm -qa| fgrep mxm
mxm-1.1.3a5e745-1.x86_64

Let me know if I have the config part correct from the previous email.

Best,
Joseph


On 12/1/2012 11:44 PM, Mike Dubman wrote:

Hi,

 

The mxm which is part of MOFED 1.5.3 supports OMPI 1.6.0.

 

An mxm upgrade is needed to work with OMPI 1.6.3+.

 

Please remove mxm from your cluster nodes (rpm -e mxm).

Install the latest from http://mellanox.com/products/mxm/

Compile ompi 1.6.3, adding the following to its configure line: ./configure --with-openib=/usr --with-mxm=/opt/mellanox/mxm <...>
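
The steps above could be scripted roughly as follows. This is a sketch under assumptions, not a tested procedure: the run wrapper, the DRY_RUN and MXM_DIR variables, and the function name are made up for illustration, the make and make install steps are the usual Open MPI build steps rather than something stated in this thread, and the install command for the new mxm package is left out because its file name is not given here. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch of the upgrade steps. With DRY_RUN=1 (the default here) the
# commands are printed with a leading "+" instead of being executed.
DRY_RUN="${DRY_RUN:-1}"
MXM_DIR="${MXM_DIR:-/opt/mellanox/mxm}"   # prefix taken from the configure line

# Hypothetical wrapper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical driver; call it from the top of the openmpi-1.6.3 tree.
rebuild_ompi_with_mxm() {
    run rpm -e mxm          # remove the old mxm package
    # Install the new mxm package here (file name not given in this thread).
    run ./configure --with-openib=/usr --with-mxm="$MXM_DIR"
    run make
    run make install
}
```

Calling rebuild_ompi_with_mxm with the defaults prints the four commands; setting DRY_RUN=0 would execute them, which needs root for the rpm step.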

 

Regards

M

On Sat, Dec 1, 2012 at 2:23 AM, Joseph Farran <jfarran@uci.edu> wrote:

Konz,

For whatever it is worth, I am in the same boat.

I have CentOS 6.3 and am trying to compile OpenMPI 1.6.3 with the mxm from Mellanox, and it fails.

Also, the Mellanox OFED (MLNX_OFED_LINUX-1.5.3-3.1.0-rhel6.3-x86_64) does not work either.

Mellanox really needs to step in here and help out.

I have a cluster full of Mellanox products and I hate to think we chose the wrong InfiniBand vendor.

Joseph




On 11/30/2012 12:33 PM, Konz, Jeffrey (SSA Solution Centers) wrote:

I tried building the latest OpenMPI-1.6.3 with MXM support and got this error:

 

make[2]: Entering directory `Src/openmpi-1.6.3/ompi/mca/mtl/mxm'
  CC     mtl_mxm.lo
  CC     mtl_mxm_cancel.lo
  CC     mtl_mxm_component.lo
  CC     mtl_mxm_endpoint.lo
  CC     mtl_mxm_probe.lo
  CC     mtl_mxm_recv.lo
  CC     mtl_mxm_send.lo
mtl_mxm_send.c: In function 'ompi_mtl_mxm_send':
mtl_mxm_send.c:96: error: 'mxm_wait_t' undeclared (first use in this function)
mtl_mxm_send.c:96: error: (Each undeclared identifier is reported only once
mtl_mxm_send.c:96: error: for each function it appears in.)
mtl_mxm_send.c:96: error: expected ';' before 'wait'
mtl_mxm_send.c:104: error: 'MXM_REQ_FLAG_BLOCKING' undeclared (first use in this function)
mtl_mxm_send.c:118: error: 'MXM_REQ_FLAG_SEND_SYNC' undeclared (first use in this function)
mtl_mxm_send.c:134: error: 'wait' undeclared (first use in this function)
mtl_mxm_send.c: In function 'ompi_mtl_mxm_isend':
mtl_mxm_send.c:183: error: 'MXM_REQ_FLAG_SEND_SYNC' undeclared (first use in this function)
make[2]: *** [mtl_mxm_send.lo] Error 1

 

 

Our OFED is 1.5.3 and our MXM version is 1.0.601.
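
The undeclared identifiers in the errors above suggest that the installed mxm headers do not define what Open MPI 1.6.3 expects. A rough way to check this before building is to search the mxm include directory for those names. This is a minimal sketch: the function name mxm_missing_symbols is made up for illustration, the include path in the usage comment is an assumption based on the /opt/mellanox/mxm prefix, and it relies on a grep that supports -r (GNU grep does).

```shell
#!/bin/sh
# Hypothetical helper: print each identifier from the compile errors
# that does not appear anywhere under the given include directory.
# Prints nothing if all three are found.
mxm_missing_symbols() {
    incdir="$1"
    for sym in mxm_wait_t MXM_REQ_FLAG_BLOCKING MXM_REQ_FLAG_SEND_SYNC; do
        # -r: search recursively, -q: quiet, -w: match whole words only
        if ! grep -rqw -- "$sym" "$incdir" 2>/dev/null; then
            echo "$sym"
        fi
    done
}

# Usage (the path is an assumption, adjust to the actual install prefix):
#   mxm_missing_symbols /opt/mellanox/mxm/include
```

Any name it prints is one the compiler will also fail to find, so the mxm package needs updating before the build can succeed.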

 

Thanks,

 

-Jeff

 




_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

 

