
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] InfiniBand, different OpenFabrics transport types
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-07-07 20:14:01

On Jun 28, 2011, at 1:46 PM, Bill Johnstone wrote:

> I have a heterogeneous network of InfiniBand-equipped hosts which are all connected to the same backbone switch, an older SDR 10 Gb/s unit.
> One set of nodes uses the Mellanox "ib_mthca" driver, while the other uses the "mlx4" driver.
> This is on Linux 2.6.32, with Open MPI 1.5.3 .
> When I run Open MPI across these node types, I get an error message of the form:
> Open MPI detected two different OpenFabrics transport types in the same InfiniBand network.
> Such mixed network transport configuration is not supported by Open MPI.
> Local host: compute-chassis-1-node-01
> Local adapter: mthca0 (vendor 0x5ad, part ID 25208)
> Local transport type: MCA_BTL_OPENIB_TRANSPORT_UNKNOWN

Wow, that's cool ("UNKNOWN"). Are you using an old version of OFED or something?

Mellanox -- how can this happen?

> Remote host: compute-chassis-3-node-01
> Remote Adapter: (vendor 0x2c9, part ID 26428)
> Remote transport type: MCA_BTL_OPENIB_TRANSPORT_IB
> Two questions:
> 1. Why is this occurring if both adapters have all the OpenIB software set up? Is it because Open MPI is trying to use functionality such as ConnectX with the newer hardware, which is incompatible with older hardware, or is it something more mundane?

It's basically a mismatch of IB capabilities -- Open MPI is trying to use more advanced features on some nodes but not on others.
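One quick way to see what each side actually reports is `ibv_devinfo` from libibverbs, which prints the transport type, firmware, and vendor for every HCA. A minimal sketch, assuming the two hostnames from the error output and passwordless ssh between nodes (adjust for your cluster):

```shell
# Compare what each node's HCA reports (ibv_devinfo ships with libibverbs).
# Hostnames below are taken from the error message; substitute your own.
for host in compute-chassis-1-node-01 compute-chassis-3-node-01; do
    echo "== $host =="
    ssh "$host" 'ibv_devinfo | grep -E "hca_id|transport|fw_ver|vendor_id"'
done
```

On a healthy pair, both sides should show a line like `transport: InfiniBand (0)`; if one adapter reports something else (or the tool errors out), that points at a stack/firmware problem on that node rather than at Open MPI.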

> 2. How can I use IB amongst these heterogeneous nodes?

Mellanox will need to answer this question... It might be possible, but I don't know how offhand. The first issue is to figure out why you're getting TRANSPORT_UNKNOWN on the one node.
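In the meantime, a hedged workaround is to take the openib BTL out of the picture entirely and let jobs spanning both node types fall back to the TCP BTL (e.g. over IPoIB). This is slower than native verbs, but it sidesteps the transport-type check; the application name below is just a placeholder:

```shell
# Workaround sketch: select only the TCP, shared-memory, and self BTLs,
# so the openib BTL (and its transport-type check) is never used.
mpirun --mca btl tcp,self,sm -np 16 ./my_mpi_app

# Equivalent exclusion form: use everything except openib.
mpirun --mca btl ^openib -np 16 ./my_mpi_app
```

Both forms are standard MCA parameter syntax in the 1.5 series; once the UNKNOWN transport is explained and fixed, you can drop the parameter and go back to native IB.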

Jeff Squyres