Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Question about '--mca btl tcp,self'
From: Jianyu Liu (jerry_leo_at_[hidden])
Date: 2014-03-16 03:16:41


Thanks for your kind input.

I have some further questions:

1. How can I check and change the ordering of network interfaces (such as tcp, ib, etc.) in the kernel?

2. One of my applications can only run with "--mca btl tcp,self"; otherwise it aborts without any specific error message, even when run on a single node. How can I figure out the possible causes?
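
On question 1, one way to at least inspect the order is to list the interfaces by their kernel index. The following is a minimal sketch using Python's standard socket.if_nameindex(). Whether this index order is the same ordering that Open MPI consults when it picks an interface is an assumption, not something confirmed in this thread, and the sketch only reads the order; it does not change it.

```python
import socket


def interfaces_by_kernel_index():
    """Return (index, name) pairs sorted by the kernel's interface index."""
    # socket.if_nameindex() returns a list of (index, name) tuples.
    return sorted(socket.if_nameindex())


if __name__ == "__main__":
    # Print one interface per line, lowest kernel index first.
    for index, name in interfaces_by_kernel_index():
        print(f"{index:3d}  {name}")
```

On a node like the one described in this thread, the listing would be expected to include the loopback device, the Ethernet interface, and ib0, but the actual indices depend on the machine.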

I appreciate your kind input.

Jianyu

> From: rhc_at_[hidden]
> Date: Sat, 15 Mar 2014 07:21:31 -0700
> To: users_at_[hidden]
> Subject: Re: [OMPI users] Question about '--mca btl tcp,self'
>
>
> On Mar 14, 2014, at 10:18 PM, Jianyu Liu <jerry_leo_at_[hidden]> wrote:
>
> >> On Mar 14, 2014, at 10:16:34 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> >>
> >>> On Mar 14, 2014, at 10:11 AM, Ralph Castain <rhc_at_[hidden]> wrote:
> >>>
> >>>> 1. If '--mca btl tcp,self' is specified, which interface will the application run on: the GigE adapter, or the OpenFabrics interface in IP over IB mode (just like a high performance GigE adapter)?
> >>>
> >>> Both - ip over ib looks just like an Ethernet adaptor
> >>
> >>
> >> To be clear: the TCP BTL will use all TCP interfaces (regardless of underlying physical transport). Your GigE adapter and your IP adapter both present IP interfaces to the OS, and both support TCP. So the TCP BTL will use them, because it just sees the TCP/IP interfaces.
> >
> > Thanks for your kind input.
> >
> > Please see if I have understood correctly.
> >
> > Assume there are two networks:
> > Gigabit Ethernet
> >
> > eth0-renamed : 192.168.[1-22].[1-14] / 255.255.192.0
> >
> > InfiniBand network
> >
> > ib0 : 172.20.[1-22].[1-4] / 255.255.0.0
> >
> >
> > 1. If '--mca btl tcp,self' is specified
> >
> > The control information (such as setup and teardown) is routed over Gigabit Ethernet in TCP/IP mode
>
> Not necessarily - the out-of-band (OOB) system will pick up one of the TCP interfaces, but which one depends on the ordering in the kernel.
>
> > The MPI messages are routed over the InfiniBand network in IP over IB mode
>
> Not necessarily - could use either device
>
> > On the same machine, the TCP loopback device will be used for passing control and MPI messages
>
> I believe the TCP BTL would use the selected device for loopback, ignoring the loopback device
>
> >
> > 2. If '--mca btl tcp,self --mca btl_tcp_if_include ib0' is specified
> >
> > Both control information (such as setup and teardown) and MPI messages are routed over the InfiniBand network in IP over IB mode
>
> No - control info is sent by the OOB, not the BTL. To get what you describe, you would have to add "-mca oob_tcp_if_include ib0"
>
> > On the same machine, the TCP loopback device will be used for passing control and MPI messages
>
> No - the TCP MPI messages would loopback via the ib0 device
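
The combination described in the answers to item 2 can be sketched as a small helper that assembles the mpirun command line. This is only an illustration: the helper name build_mpirun_args and the application path ./my_app are hypothetical, while the MCA parameter names (btl, btl_tcp_if_include, oob_tcp_if_include) and the interface name ib0 are taken from the thread. The sketch builds the argument list and does not launch anything.

```python
def build_mpirun_args(app, btl="tcp,self", btl_if=None, oob_if=None):
    """Assemble an mpirun argument list (hypothetical helper, for illustration)."""
    args = ["mpirun", "--mca", "btl", btl]
    if btl_if is not None:
        # Restrict the TCP BTL (MPI messages) to one interface.
        args += ["--mca", "btl_tcp_if_include", btl_if]
    if oob_if is not None:
        # Restrict the OOB (control messages) to one interface.
        args += ["--mca", "oob_tcp_if_include", oob_if]
    args.append(app)
    return args


if __name__ == "__main__":
    # ./my_app is a placeholder for the real MPI executable.
    print(" ".join(build_mpirun_args("./my_app", btl_if="ib0", oob_if="ib0")))
    # prints: mpirun --mca btl tcp,self --mca btl_tcp_if_include ib0 --mca oob_tcp_if_include ib0 ./my_app
```

Keeping the two interface settings as separate arguments mirrors the point made above: the BTL and the OOB select their interfaces independently, so restricting one does not restrict the other.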
>
> >
> >
> > 3. If '--mca btl openib,self' is specified
> >
> > The control information (such as setup and teardown) is routed over the InfiniBand network in IP over IB mode
>
> Not necessarily - same answer as #1
>
> > The MPI messages are routed over the InfiniBand network in RDMA mode
>
> Well, it will use IB, but may not use RDMA. That is an internal decision tree made per-message based on a variety of factors
>
> > On the same machine, the TCP loopback device will be used for passing control and MPI messages
>
> No - you excluded TCP for MPI messages, and so it would have to loopback within the IB stack. Control messages would loopback via TCP
>
> >
> >
> > 4. If no 'mca btl' parameters are specified
> >
> > The control information (such as setup and teardown) is routed over Gigabit Ethernet in TCP/IP mode
>
> Not necessarily - same answer as #1
>
> > The MPI messages are routed over the InfiniBand network in RDMA mode
>
> Same as #3
>
> > On the same machine, the shared memory (sm) BTL will be used for passing control and MPI messages
>
> Not for control - just for MPI
>
> >
> >
> > I appreciate your kind input.
> >
> > Jianyu
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users