Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Bug in oob_tcp_[in|ex]clude?
From: Marco Sbrighi (m.sbrighi_at_[hidden])
Date: 2007-12-18 11:12:06


On Mon, 2007-12-17 at 17:19 -0500, Jeff Squyres wrote:
> On Dec 17, 2007, at 8:35 AM, Marco Sbrighi wrote:
>
> > I'm using Open MPI 1.2.2 over OFED 1.2 on a 256-node, dual-Opteron,
> > dual-core Linux cluster, with an InfiniBand 4x interconnect, of course.
> >
> > Each cluster node is equipped with 4 (or more) Ethernet interfaces,
> > namely 2 gigabit ones plus 2 IPoIB. The two gigabit interfaces are
> > named eth0 and eth1, while the two IPoIB ones are named ib0 and ib1.
> >
> > eth0 is a management network with poor performance, and furthermore
> > we don't want to use the ib* interfaces to carry MPI's traffic
> > (neither OOB nor TCP), so we would like eth1 to be used for Open MPI
> > OOB and TCP.
> >
> > In order to drive the OOB over eth1 only, I've tried various
> > combinations of oob_tcp_[ex|in]clude MCA statements, starting from
> > the obvious
> >
> > oob_tcp_exclude = lo,eth0,ib0,ib1
> >
> > then trying the other obvious one:
> >
> > oob_tcp_include = eth1
>
> This one statement (_include) should be sufficient.

I agree with your interpretation: it should be sufficient, but what I'm
seeing here is that in fact it isn't.

>
> Assumedly this(these) statement(s) are in a config file that is being
> read by Open MPI, such as $HOME/.openmpi/mca-params.conf?

I've tried many combinations: only in $HOME/.openmpi/mca-params.conf,
only on the command line, and both; none of them seems to work correctly.
In any case, what I expect is that if a parameter is set in
$HOME/.openmpi/mca-params.conf and set differently on the command line,
the command-line value should take precedence.
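
As a minimal sketch of the precedence expected above (a value given on
the command line wins over the same parameter in mca-params.conf), the
lookup could be modelled as follows. The function name and the dict
representation are hypothetical, chosen only for illustration; this is
not Open MPI code.

```python
# Hypothetical model of the expected precedence; not Open MPI's implementation.
def resolve_mca_param(name, cmdline=None, conf_file=None):
    """Return the value of `name` from the highest-priority source that sets it.

    `cmdline` and `conf_file` are plain dicts standing in for the
    `--mca name value` options and the mca-params.conf entries.
    """
    for source in (cmdline, conf_file):  # command line is checked first
        if source is not None and name in source:
            return source[name]
    return None  # parameter not set anywhere


if __name__ == "__main__":
    conf_file = {"oob_tcp_include": "eth1"}
    cmdline = {"oob_tcp_exclude": "eth0"}
    # Set only in the file: the file value is used.
    print(resolve_mca_param("oob_tcp_include", cmdline, conf_file))
    # Set in both places: the command-line value is used.
    cmdline["oob_tcp_include"] = "eth1,lo"
    print(resolve_mca_param("oob_tcp_include", cmdline, conf_file))
```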
>
> > and both at the same time.
> >
> > Next I've tried the following:
> >
> > oob_tcp_exclude = eth0
> >
> > but after the job starts, I still have a lot of TCP connections
> > established using eth0, ib0, or ib1.
> > Furthermore, the following error occurs:
> >
> > [node191:03976] [0,1,14]-[0,1,12] mca_oob_tcp_peer_complete_connect:
> > connection failed: Connection timed out (110) - retrying
>
> This is quite odd. :-(
>
> > I've found only one way to have TCP connections bound only to the
> > eth1 interface: using both of the following MCA directives on the
> > command line:
> >
> > mpirun .... --mca oob_tcp_include eth1 --mca oob_tcp_include lo,eth0,ib0,ib1 .....
> >
> > This sounds like a bug to me.
>
> Yes, it does. Specifying the same MCA param twice on the command line
> results in undefined behavior -- it will only take one of them, and I
> assume it'll take the first (but I'd have to check the code to be sure).

OK, I can obtain the same behaviour using only one statement:

--mca oob_tcp_include eth1,lo,eth0,ib0,ib1

Note that with --mca mpi_show_mca_params, the report I see is the same
in both cases (the parameter given twice, or once with the combined list):

.....
[node255:30188] oob_tcp_debug=0
[node255:30188] oob_tcp_include=eth1,lo,eth0,ib0,ib1
[node255:30188] oob_tcp_exclude=
.......
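
To make the surprise concrete, here is a minimal sketch of how one would
expect an include/exclude pair to filter interfaces. Everything in it is
an assumption for illustration: the function name is hypothetical, the
rule that a non-empty include list takes priority over exclude is my
guess, and it is not taken from the Open MPI sources.

```python
# Hypothetical model of include/exclude filtering; not Open MPI's code.
def select_interfaces(available, include="", exclude=""):
    """Return the interfaces from `available` that pass the filters.

    `include` and `exclude` are comma-separated strings, as in the MCA
    parameters. Assumption: a non-empty include list wins over exclude.
    """
    include_list = [name for name in include.split(",") if name]
    exclude_list = [name for name in exclude.split(",") if name]
    if include_list:
        return [name for name in available if name in include_list]
    return [name for name in available if name not in exclude_list]


if __name__ == "__main__":
    node_ifaces = ["lo", "eth0", "eth1", "ib0", "ib1"]
    # The setting that should be sufficient:
    print(select_interfaces(node_ifaces, include="eth1"))
    # The combined list reported by mpi_show_mca_params:
    print(select_interfaces(node_ifaces, include="eth1,lo,eth0,ib0,ib1"))
```

Under this reading, the combined list names every interface on the node
and so filters nothing, which is why it is odd that it is the only
setting that confines the connections to eth1.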

>
> > Is there someone able to reproduce this behaviour?
> > If this is a bug, are there fixes?
>
>
> I'm unfortunately unable to reproduce this behavior. I have a test
> cluster with 2 IP interfaces: ib0, eth0. I have tried several
> combinations of MCA params with 1.2.2:
>
> --mca oob_tcp_include ib0
> --mca oob_tcp_include ib0,bogus
> --mca oob_tcp_include eth0
> --mca oob_tcp_include eth0,bogus
> --mca oob_tcp_exclude ib0
> --mca oob_tcp_exclude ib0,bogus
> --mca oob_tcp_exclude eth0
> --mca oob_tcp_exclude eth0,bogus
>
> All do as they are supposed to -- including or excluding ib0 or eth0.
>
> I do note, however, that the handling of these parameters changed in
> 1.2.3 -- as well as their names. The names changed to
> "oob_tcp_if_include" and "oob_tcp_if_exclude" to match other MCA
> parameter name conventions from other components.
>
> Could you try with 1.2.3 or 1.2.4 (1.2.4 is the most recent; 1.2.5 is
> due out "soon" -- it *may* get out before the holiday break, but no
> promises...)?

We have 1.2.3 on another cluster and it shows the same behaviour as
1.2.2 (BTW, the other cluster has the same Ethernet interfaces).

>
> If you can't upgrade, let me know and I can provide a debugging patch
> that will give us a little more insight into what is happening on your
> machines. Thanks.

It is quite difficult for us to upgrade Open MPI right now. We have the
official Cisco packages installed, and as far as I know 1.2.2-1 is the
only official Cisco Open MPI distribution available today.

In any case I would like to try your debug patch.

Thanks

Marco

>

-- 
-----------------------------------------------------------------
 Marco Sbrighi  m.sbrighi_at_[hidden]
 HPC Group
 CINECA Interuniversity Computing Centre
 via Magnanelli, 6/3
 40033 Casalecchio di Reno (Bo) ITALY
 tel. 051 6171516