It is worth clarifying a point in this discussion that I neglected to
mention in my initial post: although Open MPI may not work *by
default* with heterogeneous HCAs/RNICs, it is quite possible (even
likely) that MPI jobs spanning multiple different kinds of HCAs or
RNICs will work fine if you manually configure Open MPI to use the
same verbs/hardware settings across all your HCAs/RNICs (assuming
that you use a set of values that is compatible with all your
hardware).
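As a rough illustration of what "a set of values that is compatible
with all your hardware" could mean, here is a minimal Python sketch
that picks the largest MTU every device can handle and turns it into
mpirun arguments. The device names and MTU values are made up. The
parameter name btl_openib_mtu and its 1..5 encoding (256..4096 bytes)
are my assumption about the openib BTL, so check them against the
output of "ompi_info --param btl openib" on your installation. MTU is
also only one of several settings that may need to match.

```python
# Assumed encoding of the openib BTL MTU parameter: 1=256 ... 5=4096 bytes.
IB_MTU_CODES = {256: 1, 512: 2, 1024: 3, 2048: 4, 4096: 5}


def common_mtu_args(device_mtus):
    """Return mpirun MCA arguments for the largest MTU all devices support.

    device_mtus maps a device name to its maximum MTU in bytes.
    """
    if not device_mtus:
        raise ValueError("no devices given")
    # The smallest per-device maximum is the largest value safe for all.
    common = min(device_mtus.values())
    if common not in IB_MTU_CODES:
        raise ValueError(f"unsupported MTU: {common}")
    return ["--mca", "btl_openib_mtu", str(IB_MTU_CODES[common])]


# Hypothetical mixed cluster: an older mthca HCA and a ConnectX (mlx4) HCA.
devices = {"mthca0": 1024, "mlx4_0": 2048}
print(" ".join(["mpirun"] + common_mtu_args(devices) + ["./my_mpi_app"]))
```

The same "most conservative common value" idea would apply to other
per-device settings, but which ones matter depends on your hardware.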
See this post on the devel list for a few more details:
On Jan 27, 2009, at 6:08 AM, Peter Kjellstrom wrote:
> On Monday 26 January 2009, Jeff Squyres wrote:
>> The Interop Working Group (IWG) of the OpenFabrics Alliance asked me
>> to bring a question to the Open MPI user and developer communities:
>> anyone interested in having a single MPI job span HCAs or RNICs from
>> multiple vendors? (pardon the cross-posting, but I did want to ask
>> each group separately -- because the answers may be different)
>> The interop testing lab at the University of New Hampshire
>> (http://www.iol.unh.edu/services/testing/ofa/) discovered that most
>> (all?) MPI implementations fail when having a single MPI job span
>> HCAs from multiple vendors and/or span RNICs from multiple vendors.
>> I don't know the exact details (and they may not be public, anyway),
>> but I'm pretty sure that OMPI failed when used with QLogic and
>> Mellanox HCAs in a single MPI job. This is fairly unsurprising,
>> given how we tune Open MPI's use of OpenFabrics-capable hardware
>> based on our .ini file.
>> So my question is: does anyone want/need to support jobs that span
>> HCAs from multiple vendors and/or RNICs from multiple vendors?
> For these three cases:
> 1) Different vendor id but same OFED driver and basic chip
> 2) Same chip vendor, different OFED driver (mthca vs mlx4)
> 3) Any OFED supported IB HCA
> Number one should just work. We may at times have some nodes with
> HCAs that have been flashed with non-standard/non-vendor firmware.
> Number two is something I would kind of expect to work. A possible
> case where I'd need it is if I temporarily use an older HCA (mthca)
> to get a node going on a cluster with ConnectX (mlx4). Another case
> could be a cluster with two partitions with different HCAs.
> Number three would be nice to have. I think many users would assume
> it to work. Why not? They have symmetric software, all nodes run
> OFED, all working IB... It would have worked if their nodes had had
> different kinds of Ethernet NICs...