
Open MPI Development Mailing List Archives


Subject: [OMPI devel] New address selection for btl-tcp (was Re: [OMPI svn] svn:open-mpi r17307)
From: Adrian Knoth (adi_at_[hidden])
Date: 2008-02-12 15:36:29

On Fri, Feb 01, 2008 at 11:40:20AM -0500, Tim Prins wrote:

> Adrian,


Sorry for the late reply and thanks for your testing.

> 1. There are some warnings when compiling:

I've fixed these issues.

> 2. If I exclude all my tcp interfaces, the connection fails properly,
> but I do get a malloc request for 0 bytes:
> tprins_at_odin examples]$ mpirun -mca btl tcp,self -mca btl_tcp_if_exclude
> eth0,ib0,lo -np 2 ./ring_c
> malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)
> malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)
> <snip>

Not my fault, but I guess we could fix it anyway. Should we?
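If we do fix it, one option is to skip the allocation when there is nothing to allocate. The following is only a sketch of that idea, not the code at btl_tcp_component.c line 844: alloc_entries() is a hypothetical helper, and it assumes the caller can treat a NULL result for a zero count as "no entries" rather than as an out-of-memory error.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Open MPI: allocate room for `count`
 * entries of `elem_size` bytes each, but never ask malloc for 0 bytes. */
void *alloc_entries(size_t count, size_t elem_size)
{
    if (count == 0 || elem_size == 0) {
        return NULL;    /* nothing to allocate, so skip malloc entirely */
    }
    if (count > (size_t)-1 / elem_size) {
        return NULL;    /* count * elem_size would overflow size_t */
    }
    return malloc(count * elem_size);
}
```

A caller would have to check the count before dereferencing the result, since NULL here can mean either "zero entries" or a failed allocation.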

> 3. If the exclude list does not contain 'lo', or the include list
> contains 'lo', the job hangs when using multiple nodes:

That's weird. Loopback interfaces should automatically be excluded right
from the beginning. See opal/util/if.c.

I don't know where things go wrong, and I haven't checked yet. Do you want
to investigate? As already mentioned, this should not happen.

Can you post the output of "ip a s" or "ifconfig -a"?

> However, the great news about this patch is that it appears to fix
> for me.

It also fixes my #1206. I'd like to merge tmp-public/btl-tcp into the
trunk, especially before the 1.3 code freeze. Any objections?

Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany