Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Need help running jobs across different IB vendors
From: Dave Love (d.love_at_[hidden])
Date: 2013-10-15 10:15:43

"Kevin M. Hildebrand" <kevin_at_[hidden]> writes:

> Hi, I'm trying to run an OpenMPI 1.6.5 job across a set of nodes, some
> with Mellanox cards and some with Qlogic cards.

Maybe you shouldn't... (I'm blessed in one cluster with three somewhat
incompatible types of QLogic card and a set of Mellanox ones, but
they're in separate islands, apart from the two different SDR ones.)

> I'm getting errors indicating "At least one pair of MPI processes are
> unable to reach each other for MPI communications". As far as I can
> tell all of the nodes are properly configured and able to reach each
> other, via IP and non-IP connections.
> I've also discovered that even if I turn off the IB transport via
> "--mca btl tcp,self" I'm still getting the same issue.
> The test works fine if I run it confined to hosts with identical IB cards.
> I'd appreciate some assistance in figuring out what I'm doing wrong.

I assume the QLogic cards are using PSM. PSM is an MTL rather than a
BTL, so "--mca btl tcp,self" on its own does not disable it. You'd need
to force those nodes to use openib with something like "--mca mtl ^psm",
and make sure they have the ipathverbs library available. You probably
won't like the resulting performance, though: users here noticed when
one set fell back to openib from psm recently.
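
For illustration, the suggestion above might look something like the following on the command line. This is only a sketch: the hostfile name, process count, and test binary are made up, and the only MCA setting shown is the one mentioned above. The script just prints the command so it can be checked before launching anything.

```shell
#!/bin/sh
# Hypothetical names: substitute your own hostfile, process count and binary.
HOSTFILE=mixed_hosts.txt
NPROCS=4
APP=./mpi_test

# Exclude the PSM MTL, as suggested above, so the QLogic nodes do not
# select PSM and can fall back to the verbs-based openib transport.
MCA_ARGS="--mca mtl ^psm"

# Assemble the full command line and print it for inspection.
CMD="mpirun $MCA_ARGS --hostfile $HOSTFILE -np $NPROCS $APP"
echo "$CMD"

# To launch for real, replace the echo above with:
#   mpirun $MCA_ARGS --hostfile "$HOSTFILE" -np "$NPROCS" "$APP"
```

Before launching, running ibv_devinfo on one of the QLogic nodes is a quick way to confirm that the card is visible through the verbs interface, which is what openib needs.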