On Fri, Feb 01, 2008 at 11:40:20AM -0500, Tim Prins wrote:
> Adrian,
Hi!
Sorry for the late reply and thanks for your testing.
> 1. There are some warnings when compiling:
I've fixed these issues.
> 2. If I exclude all my tcp interfaces, the connection fails properly,
> but I do get a malloc request for 0 bytes:
> [tprins@odin examples]$ mpirun -mca btl tcp,self -mca btl_tcp_if_exclude
> eth0,ib0,lo -np 2 ./ring_c
> malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)
> malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)
> <snip>
Not my fault, but I guess we could fix it anyway. Should we?
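If we do fix it, one option would be to skip the allocation entirely when no interfaces are left after filtering. The following is only a rough sketch of that idea, not the code at btl_tcp_component.c:844; `alloc_array` is a made-up helper name, and I am assuming the request there is sized by the number of remaining interfaces:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Open MPI): allocate room for `count`
 * elements of `elem_size` bytes, but never pass a 0-byte request to malloc.
 * Returns NULL for count == 0 and on overflow or allocation failure, so the
 * caller has to check `count` itself to tell "empty" from "out of memory". */
void *alloc_array(size_t count, size_t elem_size)
{
    assert(elem_size > 0);

    if (count == 0) {
        return NULL;                 /* nothing to allocate, no 0-byte request */
    }
    if (count > SIZE_MAX / elem_size) {
        return NULL;                 /* count * elem_size would overflow */
    }
    return malloc(count * elem_size);
}
```

With something like this the caller would check the interface count first and treat zero as "no usable interface" instead of continuing with an empty buffer.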
> 3. If the exclude list does not contain 'lo', or the include list
> contains 'lo', the job hangs when using multiple nodes:
That's weird. Loopback interfaces should automatically be excluded right
from the beginning. See opal/util/if.c.
I don't know where things go wrong and haven't checked yet. Do you want to
investigate? As already mentioned, this should not happen.
Can you post the output of "ip a s" or "ifconfig -a"?
> However, the great news about this patch is that it appears to fix
> https://svn.open-mpi.org/trac/ompi/ticket/1027 for me.
It also fixes my #1206. I'd like to merge tmp-public/btl-tcp into the
trunk, especially before the 1.3 code freeze. Any objections?
--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
private: http://adi.thur.de