Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] QLogic HCA random crash after prolonged use
From: Dave Love (d.love_at_[hidden])
Date: 2013-04-25 12:24:34


"Elken, Tom" <tom.elken_at_[hidden]> writes:

>> > Intel has acquired the InfiniBand assets of QLogic
>> > about a year ago. These SDR HCAs are no longer supported, but should
>> > still work.
> [Tom]
> I guess the more important part of what I wrote is that "These SDR HCAs are no longer supported" :)

Sure, though from our point of view, they never were. Good riddance to
that cluster vendor, who should have gone out of business earlier.

> [Tom]
> Some testing from an Intel group who had these QLE7140 HCAs revealed to me that they do _not_ work with our recent software stack such as IFS 7.1.1 (which includes OFED 1.5.4.1).

I suspect I had done the experiment too.

> They were able to get them to work with the QLogic OFED+ 6.0.2 stack.
> That corresponds to OFED 1.5.2 -- that was the first OFED to include
> PSM.

In case it helps for anyone else trying this with old kit: I had been
using a v6.something, but I'd have to check the something. Using the
set of "updates" modules built with that and the latest kernel also
provokes the crashes, binary compatibility or not.

> I am providing this info as a courtesy, but not making any guarantees
> that it will work.

Understood, and thanks.

> [Tom]
> The older QLogic and OFED stacks mentioned above were neither ported to nor tested with RHEL 5.9, which did not exist at the time. Sorry.

Sure, and presumably the Red Hat module shouldn't match the hardware if
it won't work. (The kernel supports the even older QHT cards OK -- pity
anyone running an old cluster with Mellanox added to three incompatible
lots of InfiniPath and Ethernet islands.)