Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp
From: Barrett, Brian W (bwbarre_at_[hidden])
Date: 2013-02-01 21:59:18


I don't think this is right either. Excluding a device that doesn't exist has many use cases, such as disabling a network that only exists on part of the cluster. I'm not sure what to do with seq; it's more like include than exclude.
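
To make the exclude case concrete, here is a minimal sketch in C. It is not the actual btl_tcp code: the function names (name_in_list, filter_excluded) and the interface names in the usage note are assumptions made up for illustration. It shows an exclude filter where a name that matches no local interface is simply a no-op, so one exclude list can be used on every node.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears in `list` (n entries), else 0. */
int name_in_list(const char *name, const char **list, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(name, list[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper, not an Open MPI function: copies into `out`
 * every local interface that is not named in `exclude` and returns
 * how many were kept. An exclude entry that matches no local
 * interface changes nothing; it is not treated as an error. */
size_t filter_excluded(const char **local, size_t n_local,
                       const char **exclude, size_t n_exclude,
                       const char **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n_local; ++i) {
        if (!name_in_list(local[i], exclude, n_exclude)) {
            out[kept++] = local[i];
        }
    }
    return kept;
}
```

With an exclude list of "lo" and "ib0", a node that has lo, eth0, and ib0 and a node that has only lo and eth0 both end up with eth0, which is the partial-cluster case described above.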

Brian

 -----Original Message-----
From: Jeff Squyres (jsquyres) [mailto:jsquyres_at_[hidden]]
Sent: Friday, February 01, 2013 06:05 PM Mountain Standard Time
To: Open MPI Developers
Subject: [EXTERNAL] Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp

On Feb 1, 2013, at 7:09 PM, George Bosilca <bosilca_at_[hidden]> wrote:

> I did not say we abort; I said we prevent the TCP BTL from being used.

Ah.

> In your example, I guess TCP is disabled but the PML finds another
> available interface and keeps going. If I try the same thing with
> "--mca btl tcp,self" it does abort on my cluster.
>
> ---
> mpirun -np 2 --mca btl tcp,self --mca btl_tcp_if_include eth3 ./ring_c
> [dancer02][[48001,1],1][../../../../../ompi/ompi/mca/btl/tcp/btl_tcp_component.c:682:mca_btl_tcp_component_create_instances]
> invalid interface "eth3"

Good point.

But it looks like that behavior doesn't occur for btl_tcp_if_exclude:

------
[16:57] savbu-usnic:~/svn/ompi-1.6/examples % mpirun --host node001,node002 --mca btl tcp,self --mca btl_tcp_if_exclude lo,bogus ring_c
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
Process 1 exiting
[16:58] savbu-usnic:~/svn/ompi-1.6/examples %
------

So it sounds like I should:

1. put if_seq back the way it was
2. fix the if_seq show_help message to say that TCP won't be used (right now it just says that the value will be ignored -- which is one of the other reasons I changed the behavior to ignore the value)
3. make btl_tcp_if_exclude exhibit the same behavior (if a bogus interface is specified, disable TCP)
4. make all error cases nice-nice with show_help instead of BTL_VERBOSE :-)
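
As a rough sketch of what item 3 would mean, assuming one strict rule shared by if_include and if_exclude (the enum and function below are hypothetical names for illustration, not identifiers from btl_tcp_component.c):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome codes; these are not Open MPI identifiers. */
enum if_check { IF_OK = 0, IF_DISABLE_TCP = 1 };

/* Hypothetical check: every requested interface name must match a
 * local interface. If any name matches nothing, report that the TCP
 * BTL should be disabled. Under item 3 the same check would apply to
 * both the include list and the exclude list. */
enum if_check check_requested(const char **requested, size_t n_req,
                              const char **local, size_t n_local)
{
    for (size_t i = 0; i < n_req; ++i) {
        int found = 0;
        for (size_t j = 0; j < n_local; ++j) {
            if (strcmp(requested[i], local[j]) == 0) {
                found = 1;
                break;
            }
        }
        if (!found) {
            return IF_DISABLE_TCP;
        }
    }
    return IF_OK;
}
```

On a node with only lo and eth0 (an assumed layout), this returns IF_DISABLE_TCP for both "eth3" and "lo,bogus". That matches the if_include behavior George showed, but it would change the if_exclude run above, which currently completes.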

Agree?

-- 
Jeff Squyres
jsquyres_at_[hidden]