Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp
From: Barrett, Brian W (bwbarre_at_[hidden])
Date: 2013-02-01 21:59:18


I don't think this is right either. Excluding a device that doesn't exist has many use cases, such as disabling a network that only exists on part of the cluster. I'm not sure what to do with seq; it behaves more like include than exclude.
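
To make the partial-cluster case concrete, here is a minimal sketch in C of a lenient exclude filter. It is not the btl_tcp code: the helper names (name_in_exclude_list, count_usable) and the interface names in the usage note are hypothetical, and it only handles plain comma-separated interface names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `name` equals one comma-separated
 * entry of `exclude_list`. */
static bool name_in_exclude_list(const char *name, const char *exclude_list)
{
    assert(name != NULL && exclude_list != NULL);
    size_t name_len = strlen(name);
    const char *p = exclude_list;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current entry */
        if (len == name_len && strncmp(p, name, len) == 0) {
            return true;
        }
        p += len;
        if (*p == ',') {
            ++p;                        /* skip the separator */
        }
    }
    return false;
}

/* Hypothetical helper: count the local interfaces that survive the
 * exclude list. An entry naming an interface that is absent on this
 * node matches nothing and is therefore harmless. */
static size_t count_usable(const char *const local_ifs[], size_t n_local,
                           const char *exclude_list)
{
    size_t usable = 0;
    for (size_t i = 0; i < n_local; ++i) {
        if (!name_in_exclude_list(local_ifs[i], exclude_list)) {
            ++usable;
        }
    }
    return usable;
}
```

With an exclude list of "lo,ib0", a node that has lo, eth0 and ib0 and a node that has only lo and eth0 both end up with eth0 as the single usable interface, so one command line works across the whole cluster.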

Brian

 -----Original Message-----
From: Jeff Squyres (jsquyres) [mailto:jsquyres_at_[hidden]]
Sent: Friday, February 01, 2013 06:05 PM Mountain Standard Time
To: Open MPI Developers
Subject: [EXTERNAL] Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp

On Feb 1, 2013, at 7:09 PM, George Bosilca <bosilca_at_[hidden]> wrote:

> I did not say we abort; I said we prevent the TCP BTL from being used.

Ah.

> In your example, I guess the TCP is disabled but the PML finds another
> available interface and keeps going. If I try the same thing with
> "--mca btl tcp,self" it does abort on my cluster.
>
> ---
> mpirun -np 2 --mca btl tcp,self --mca btl_tcp_if_include eth3 ./ring_c
> [dancer02][[48001,1],1][../../../../../ompi/ompi/mca/btl/tcp/btl_tcp_component.c:682:mca_btl_tcp_component_create_instances]
> invalid interface "eth3"

Good point.

But it looks like that behavior doesn't occur for btl_tcp_if_exclude:

------
[16:57] savbu-usnic:~/svn/ompi-1.6/examples % mpirun --host node001,node002 --mca btl tcp,self --mca btl_tcp_if_exclude lo,bogus ring_c
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
Process 1 exiting
[16:58] savbu-usnic:~/svn/ompi-1.6/examples %
------

So it sounds like I should:

1. put if_seq back the way it was
2. fix the if_seq show_help message to say that TCP won't be used (right now it just says that the value will be ignored -- which is one of the other reasons I changed the behavior to ignore the value)
3. make btl_tcp_if_exclude exhibit the same behavior (if a bogus interface is specified, disable TCP)
4. make all error cases nice-nice with show_help instead of BTL_VERBOSE :-)

Agree?
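
For reference, the check proposed in item 3 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code in btl_tcp_component.c: the helper names (if_is_known, if_list_all_known) are hypothetical, the set of known interfaces is passed in as a plain array, and only plain comma-separated interface names are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `name` appears in the `known` array. */
static bool if_is_known(const char *name,
                        const char *const known[], size_t n_known)
{
    for (size_t i = 0; i < n_known; ++i) {
        if (strcmp(name, known[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: true only if every comma-separated entry in
 * `list` names a known interface. Under the proposal, a false result
 * for either the include or the exclude list would disable TCP. */
static bool if_list_all_known(const char *list,
                              const char *const known[], size_t n_known)
{
    assert(list != NULL);
    const char *p = list;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current entry */
        if (len > 0) {
            char name[64];              /* assumed upper bound on name length */
            if (len >= sizeof(name)) {
                return false;           /* too long to be a valid name here */
            }
            memcpy(name, p, len);
            name[len] = '\0';
            if (!if_is_known(name, known, n_known)) {
                return false;           /* bogus interface found */
            }
        }
        p += len;
        if (*p == ',') {
            ++p;                        /* skip the separator */
        }
    }
    return true;
}
```

On a node whose interfaces are lo and eth0, if_list_all_known("lo,bogus", ...) returns false, which is the case that currently runs to completion in the ring_c output above.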

-- 
Jeff Squyres
jsquyres_at_[hidden]