I will check the code of OB1 more carefully. Thanks.
> Date: Thu, 25 Oct 2012 10:55:51 -0700
> From: Ralph Castain <rhc_at_[hidden]>
> Subject: Re: [OMPI devel] NIC Failover and Message Stripping of Open
> To: Open MPI Developers <devel_at_[hidden]>
> Message-ID: <B1A13D1B-02A2-4E67-B0CD-FA924538D458_at_[hidden]>
> Content-Type: text/plain; charset="us-ascii"
> Just an FYI - I asked a similar question recently and got the following
> answer from Rolf:
> > In my case, it was specific to openib only and it required you to be
> running with two or more IB rails.
> > Then, if one of them failed, we just shut it down, and continued with
> the working ones.
> > You could only get use of the failing rail if it was fixed and a new job
> was started.
> > To get this to work, I created a new PML called bfo. I also had to make
> some changes in the openib BTL.
> > By default, none of the code is configured in. There is a README in the
> PML bfo directory that
> > actually does quite a good job explaining what I did.
> The bfo module is included in the 1.6 series, and in the upcoming 1.7
> series. Can't say anything as to its state of repair.
> On Oct 25, 2012, at 10:41 AM, George Bosilca <bosilca_at_[hidden]> wrote:
> > On Oct 25, 2012, at 17:54 , Lirong Jian <lirong.misc_at_[hidden]> wrote:
> >> Hi folks,
> >> Sorry to bother you guys, but I have some questions about Open MPI and
> really want your help.
> >> There are some papers (e.g., [1, 2, 3], although they are rather
> dated) mentioning that Open MPI supports NIC failover and message
> striping over multiple NICs. However, when I read the source code of
> openmpi-1.6.2, I couldn't find any component named DR or TEG (which are
> mentioned in those papers and are supposed to support NIC failover and
> message striping). So my question is:
> >> Does the 1.6.2 release of Open MPI support these two kinds of
> functionality? If so, which part of the code corresponds to these
> > Lirong,
> > As you noticed, the papers are quite old and dusty.
> > Due to a lack of interest from the community, the DR PML has been retired
> from our stable releases. In other words, no stable Open MPI version
> supports network failover. However, the code is still available in the
> trunk, but there is no guarantee it still does what it was designed for.
> > TEG has been replaced with OB1, which is our current network management
> layer. It does striping over multiple NICs (identical or not) by default.
> > george.
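To make the two ideas in this thread concrete (striping a message over several NICs, and carrying on with the surviving rails after one fails), here is a minimal Python sketch. It is not OB1 or bfo code: the function name `stripe_message`, the rail names, and the bandwidth figures are all made up for illustration, and a real PML schedules fragments rather than doing a single split. It only assumes that faster links should carry proportionally more bytes.

```python
def stripe_message(total_bytes, bandwidths):
    """Split total_bytes across links in proportion to their bandwidth.

    Hypothetical helper for illustration only; not an Open MPI API.
    `bandwidths` maps a link name to a positive relative bandwidth.
    Returns a dict mapping each link name to the bytes it should carry.
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if not bandwidths or any(bw <= 0 for bw in bandwidths.values()):
        raise ValueError("need at least one link, all with positive bandwidth")

    total_bw = sum(bandwidths.values())
    # Integer share per link, rounded down.
    shares = {name: total_bytes * bw // total_bw
              for name, bw in bandwidths.items()}
    # Bytes lost to rounding go to the fastest link.
    leftover = total_bytes - sum(shares.values())
    fastest = max(bandwidths, key=bandwidths.get)
    shares[fastest] += leftover
    return shares


# Two made-up rails with different speeds: the split follows the 1:3 ratio.
rails = {"rail0": 10, "rail1": 30}
before = stripe_message(1000, rails)

# Failover in the sense described above: drop the failed rail and
# keep going with whatever still works.
working = {name: bw for name, bw in rails.items() if name != "rail1"}
after = stripe_message(1000, working)
```

With these made-up numbers, `before` assigns 250 bytes to `rail0` and 750 to `rail1`, and `after` puts all 1000 bytes on `rail0`.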
> >> Many thanks in advance.
> >> P.S. I am a newbie in this domain. My questions may be simple, even
> naive, but your help would be highly appreciated.
> >> Best,
> >> Lirong
> >> [1] Network Fault Tolerance in Open MPI.
> >> [2] Open MPI: A High Performance, Flexible Implementation of MPI
> Point-to-Point Communications.
> >> [3] TEG: A High-Performance, Scalable, Multi-network, Point-to-Point,
> Communications Methodology.
> >> _______________________________________________
> >> devel mailing list
> >> devel_at_[hidden]
> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> End of devel Digest, Vol 2285, Issue 2