Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r28016 - trunk/ompi/mca/btl/tcp
From: Jeff Squyres (jsquyres) (jsquyres_at_[hidden])
Date: 2013-02-01 18:50:07


On Feb 1, 2013, at 6:28 PM, George Bosilca <bosilca_at_[hidden]> wrote:

> So far, all interfaces specified via MCA parameters for the BTL TCP
> are required to exist. Otherwise an error message is printed and an
> error returned to the upper level, with the intent that no BTLs of
> this type will be enabled (as an example btl_tcp_component.c:682).

Actually, it doesn't abort today -- that's why I made if_seq match the behavior of the other parameters.

For example, if I exclude an interface that doesn't exist (on v1.6 HEAD):

-----
[15:40] savbu-usnic:~/svn/ompi-1.6/examples % mpirun -np 2 --mca btl_tcp_if_exclude lo,bogus ring_c
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
Process 1 exiting
[15:40] savbu-usnic:~/svn/ompi-1.6/examples %
-----

Or if I include an interface that doesn't exist (although this one warns):

-----
[15:40] savbu-usnic:~/svn/ompi-1.6/examples % mpirun -np 2 --mca btl_tcp_if_include eth0,bogus ring_c
[savbu-usnic][[7221,1],0][btl_tcp_component.c:682:mca_btl_tcp_component_create_instances] invalid interface "bogus"
[savbu-usnic][[7221,1],1][btl_tcp_component.c:682:mca_btl_tcp_component_create_instances] invalid interface "bogus"
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
Process 1 exiting
[15:42] savbu-usnic:~/svn/ompi-1.6/examples %
-----

Are there other cases that I'm missing where we *do* abort?

If so, we should probably be consistent: pick one way (abort or not abort) and do that in all cases. I don't think I have much of an opinion here on which way we should go; I can see multiple arguments:

- We should abort: we have a large precedent in many other places in OMPI that if a human asks for something OMPI can't deliver, we abort and make the human figure it out.

- We should warn/not abort: this is the behavior we've had for a long time. Changing it may break backwards compatibility.
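
To make the two options concrete, here is a minimal standalone sketch in C. It is not the BTL TCP code: if_policy_t, if_exists() and check_requested_ifs() are hypothetical names invented for illustration, and the list of "known" interfaces is passed in by the caller rather than queried from the system. It only shows how the same check behaves under each policy.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical policy switch; not an actual Open MPI type. */
typedef enum { IF_POLICY_ABORT, IF_POLICY_WARN } if_policy_t;

/* Return 1 if `name` appears in known[0..n_known), else 0. */
static int if_exists(const char *name,
                     const char *const *known, size_t n_known)
{
    for (size_t i = 0; i < n_known; ++i) {
        if (0 == strcmp(name, known[i])) {
            return 1;
        }
    }
    return 0;
}

/*
 * Check every requested interface name against the known ones.
 *
 * IF_POLICY_WARN:  print a message for each missing name, skip it,
 *                  and return the number of usable interfaces.
 * IF_POLICY_ABORT: print a message for the first missing name and
 *                  return -1 so the caller can give up.
 */
static int check_requested_ifs(const char *const *requested, size_t n_req,
                               const char *const *known, size_t n_known,
                               if_policy_t policy)
{
    int usable = 0;

    for (size_t i = 0; i < n_req; ++i) {
        if (if_exists(requested[i], known, n_known)) {
            ++usable;
            continue;
        }
        fprintf(stderr, "invalid interface \"%s\"\n", requested[i]);
        if (IF_POLICY_ABORT == policy) {
            return -1;
        }
    }
    return usable;
}
```

With known = {"lo", "eth0"} and requested = {"eth0", "bogus"}, the WARN policy reports "bogus" and carries on with one usable interface (which is what the if_include run above looks like), while the ABORT policy returns -1 at the first missing name.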

> If I correctly understand your commit, it changes this [so far
> consistent] behavior for a single one of our TCP MCA parameters
> (if_seq) to: print an error message and then continue. As you set
> the mca_btl_tcp_component.tcp_if_seq to NULL this is as if this
> argument was never provided.
>
> I prefer the old behavior for its corrective meaning (you fix it and
> then it works), as well as for its consistency with the other BTL TCP
> parameters.
>
> George.
>
>
>
> On Fri, Feb 1, 2013 at 3:17 PM, <svn-commit-mailer_at_[hidden]> wrote:
>> Author: jsquyres (Jeff Squyres)
>> Date: 2013-02-01 15:17:43 EST (Fri, 01 Feb 2013)
>> New Revision: 28016
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/28016
>>
>> Log:
>> As the help message states, it's not an ''error'' if the specified
>> interface is not found. It should just be skipped.
>>
>> Text files modified:
>> trunk/ompi/mca/btl/tcp/btl_tcp_component.c | 8 +++++---
>> 1 files changed, 5 insertions(+), 3 deletions(-)
>>
>> Modified: trunk/ompi/mca/btl/tcp/btl_tcp_component.c
>> ==============================================================================
>> --- trunk/ompi/mca/btl/tcp/btl_tcp_component.c Fri Feb 1 09:27:37 2013 (r28015)
>> +++ trunk/ompi/mca/btl/tcp/btl_tcp_component.c 2013-02-01 15:17:43 EST (Fri, 01 Feb 2013) (r28016)
>> @@ -314,10 +314,12 @@
>> ompi_process_info.nodename,
>> mca_btl_tcp_component.tcp_if_seq,
>> "Interface does not exist");
>> - return OMPI_ERR_BAD_PARAM;
>> + free(mca_btl_tcp_component.tcp_if_seq);
>> + mca_btl_tcp_component.tcp_if_seq = NULL;
>> + } else {
>> + BTL_VERBOSE(("Node rank %d using TCP interface %s",
>> + node_rank, mca_btl_tcp_component.tcp_if_seq));
>> }
>> - BTL_VERBOSE(("Node rank %d using TCP interface %s",
>> - node_rank, mca_btl_tcp_component.tcp_if_seq));
>> }
>> }
>>
>> _______________________________________________
>> svn mailing list
>> svn_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/svn
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/