Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] multi-rail failover with IB
From: Pavel Shamis (Pasha) (pasha_at_[hidden])
Date: 2008-04-03 03:15:24


Jeff Squyres wrote:
>> can OpenMPI also deal with one of the subnets failing?
>> ie. will OpenMPI automatically fall back to using the last remaining
>> working IB port out of a node, or even fallback to GigE if all the IB
>> fails?
>>
>
> Not in the 1.2 series.
>
> The 1.3 series *may* include "APM" support (automatic path migration
> -- a feature in IB). It looks positive that that'll make the 1.3 cut,
> but I don't have definite information yet.
>
The current ompi-trunk has an APM implementation. If you enable APM, ompi
will use only the first port on the HCA for data transmission, and the
second one will be reserved as a backup. On a network failure on the first
port, all connections will migrate to the second port. APM works only at
the HCA level: you cannot migrate between different HCAs, only between
the 2 ports of the same HCA.
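
To make the behaviour concrete, here is a small toy model in Python. It is
only a sketch of the semantics described above, not Open MPI or verbs code:
the Port and Connection classes and the HCA name "mlx4_0" are made up for
illustration, and I assume for simplicity that the alternate path is not
re-armed after a migration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Port:
    hca: str   # hypothetical HCA name, e.g. "mlx4_0"
    num: int   # port number on that HCA


class Connection:
    """Toy model of one connection with a primary and an alternate path."""

    def __init__(self, primary: Port, alternate: Port):
        # APM works at the HCA level: both paths must be on the same HCA.
        if primary.hca != alternate.hca:
            raise ValueError("alternate port must be on the same HCA")
        if primary.num == alternate.num:
            raise ValueError("alternate must be a different port")
        self.active: Port = primary
        self.alternate: Optional[Port] = alternate

    def port_failed(self, port: Port) -> bool:
        """Migrate to the alternate path if the active port failed.

        Returns True if a migration happened, False otherwise.
        """
        if port != self.active or self.alternate is None:
            return False
        # Assumption of this sketch: no re-arm, so the backup is used up.
        self.active, self.alternate = self.alternate, None
        return True


if __name__ == "__main__":
    p1, p2 = Port("mlx4_0", 1), Port("mlx4_0", 2)
    conn = Connection(p1, p2)
    conn.port_failed(p1)
    print(conn.active.num)
```

In this model a failure on port 1 moves the connection to port 2 of the
same HCA, and trying to build a Connection whose alternate port sits on a
different HCA raises ValueError, which mirrors the restriction above.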

-- 
Pavel Shamis (Pasha)
Mellanox Technologies