Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] QLogic HCA random crash after prolonged use
From: Vanja Z (vanja_z_at_[hidden])
Date: 2013-06-15 13:44:48


>>  I have seen it recommended to use psm instead of openib for QLogic cards.

> [Tom]
> Yes.  PSM will perform better and be more stable when running OpenMPI than using
> verbs.  Intel has acquired the InfiniBand assets of QLogic about a year ago. 
> These SDR HCAs are no longer supported, but should still work.  You can get the
> driver (ib_qib) and PSM library from OFED 1.5.4.1 or the current release OFED
> 3.5.
>
> With the current OFED 3.5 release there are included psm-release notes which
> start out this way (read down to the OpenMPI build instructions for PSM):

Thanks for the reply (and sorry for my late response). I had already tried
compiling OpenMPI with the "--with-psm" flag. It compiles, but that doesn't
seem to get me much closer to actually using psm.
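
One way to narrow this down is to check whether the build actually produced a
PSM component before looking at run-time selection. The sketch below parses the
text printed by "ompi_info" and reports whether a "psm" component is listed
under the "mtl" framework. It assumes the usual "MCA <framework>: <component>"
line layout of ompi_info output; the helper names mca_components and
has_psm_mtl are made up for illustration and are not part of Open MPI.

```python
import re
import shutil
import subprocess

# Matches ompi_info lines of the assumed form "MCA mtl: psm (MCA v2.0, ...)".
MCA_LINE = re.compile(r"^\s*MCA\s+(\w+):\s+(\S+)")


def mca_components(ompi_info_text):
    """Return {framework: set of component names} parsed from ompi_info output."""
    found = {}
    for line in ompi_info_text.splitlines():
        match = MCA_LINE.match(line)
        if match:
            found.setdefault(match.group(1), set()).add(match.group(2))
    return found


def has_psm_mtl(ompi_info_text):
    """True if a 'psm' component is listed under the 'mtl' framework."""
    return "psm" in mca_components(ompi_info_text).get("mtl", set())


if __name__ == "__main__":
    # Only query the local installation if ompi_info is actually on PATH.
    if shutil.which("ompi_info") is None:
        print("ompi_info not found on PATH")
    else:
        text = subprocess.run(
            ["ompi_info"], capture_output=True, text=True
        ).stdout
        print("PSM MTL built:", has_psm_mtl(text))
```

If the component is present, PSM can usually be requested explicitly at run
time with something like "mpirun --mca pml cm --mca mtl psm ...", which should
make the job fail with an error rather than quietly fall back to openib. Whether
that works here still depends on the ib_qib driver and PSM library being set up
correctly.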

I've found software packages available from the Intel site,
http://www.intel.com/content/www/us/en/search.html?keyword=qlogic+ofed
It seems like installing these on a supported OS (RHEL 5/6 or SLES 10/11)
is the recommended method for using QLogic/Intel cards. I also found
this very informative post by Julien Blache explaining how he got it all
working on Debian Squeeze,
http://swik.net/Debian/Planet+Debian/Julien+Blache%3A+QLogic+QLE73xx+InfiniBand+adapters,+QDR,+ib_qib,+OFED+1.5.2+and+Debian+Squeeze/e56if
It seems like, apart from building OpenMPI with the right flag, there is
also some configuration required, involving at the very least a utility called
iba_portconfig.sh and an openibd initscript. I have tried getting these
utilities from various sources and I can't find a version that doesn't
segfault on my machines (Debian Wheezy). It's also not clear to me what
should come from the Debian repos and what should come from the Intel
package, including what to do about the kernel :S

The more I read online, the more it seems that these cards have absolutely
no hope of operating stably. With a recent kernel upgrade I'm also getting a
new MPI fork warning that some searching indicates is also connected to
QLogic cards. I bought 24 of these cards a few months ago and it has
turned into the biggest computer-related nightmare I've ever
experienced. I'm beginning to think I'm better off trying to sell them
and buy equivalent cards from Mellanox (I have 2 Mellanox cards that
seem to work fine on Debian out of the box).

Have I got any chance of making these cards work on Debian Wheezy?