Open MPI User's Mailing List Archives

From: smairal_at_[hidden]
Date: 2007-05-30 16:04:07


I use a shared-memory system, and for my MPI runs I set the IP address
of every node to 127.0.0.1 in some_hostfile and execute the program with
"mpirun --machinefile some_hostfile -np 4 prog-name". As far as I know,
the sm btl is on by default. Will that help in a case like this? I am not
sure, but it may be worth a try if you haven't tried it already, Bill.
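
A minimal sketch of that setup (the slot count is illustrative, not from the
thread; adjust it to the number of local processes you want):

    # some_hostfile: every rank maps to the loopback address
    127.0.0.1 slots=4

    # launch four ranks from that hostfile; prog-name stands in for the real binary
    mpirun --machinefile some_hostfile -np 4 prog-name

With all ranks on one host the sm btl can carry the MPI traffic, but note that
the runtime's out-of-band (OOB) TCP sockets are a separate layer, as discussed
further down the thread.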

-Sarang.

Quoting Brian Barrett <bbarrett_at_[hidden]>:

> Bill -
>
> This is a known issue in all released versions of Open MPI. I have a
> patch that hopefully will fix this issue in 1.2.3. It's currently
> waiting on people in the Open MPI team to verify I didn't do
> something stupid.
>
> Brian
>
> On May 29, 2007, at 9:59 PM, Bill Saphir wrote:
>
> >
> > George,
> >
> > This is one of the things I tried, and setting the oob interface
> > did not work, with the error message below.
> >
> > Also, per this thread:
> > http://www.open-mpi.org/community/lists/users/2007/05/3319.php
> > I believe it is oob_tcp_include, not oob_tcp_if_include. The latter
> > is silently
> > ignored in 1.2, as far as I can tell.
> >
> > Interestingly, telling the MPI layer to use lo0 (or to not use tcp
> > at all) works fine.
> > But when I try to do the same for the OOB layer, it complains. The
> > full error is:
> >
> > [mymac.local:07001] [0,0,0] mca_oob_tcp_init: invalid address ''
> > returned for selected oob interfaces.
> > [mymac.local:07001] [0,0,0] ORTE_ERROR_LOG: Error in file oob_tcp.c
> > at line 1196
> >
> > mpirun actually hangs at this point and no processes are spawned. I
> > have to ^C to stop it.
> > I see this behavior on both Mac OS and Linux with 1.2.2.
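
For reference, a sketch of the invocations Bill describes (prog-name and the
process count are placeholders, not names from the thread):

    # restricting the MPI (btl) layer to loopback, or avoiding TCP for MPI
    # traffic entirely, runs without error (though the OOB sockets still
    # bind to the LAN address):
    mpirun --mca btl_tcp_if_include lo0 -np 2 prog-name
    mpirun --mca btl self,sm -np 2 prog-name

    # applying the same restriction to the OOB layer fails in the 1.2 series
    # with the "invalid address ''" error quoted above, and mpirun hangs:
    mpirun --mca oob_tcp_include lo0 -np 2 prog-name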
> >
> > Bill
> >
> >
> > George Bosilca wrote:
> >> There are 2 sets of sockets: one for the oob layer and one for the
> >> MPI layer (at least if TCP support is enabled). Therefore, in order
> >> to achieve what you're looking for you should add to the command line
> >> "--mca oob_tcp_if_include lo0 --mca btl_tcp_if_include lo0".
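
A sketch of the full command line George's suggestion implies (the process
count and program name are placeholders):

    mpirun --mca oob_tcp_if_include lo0 --mca btl_tcp_if_include lo0 -np 2 prog-name

Note that, as Bill reports above, oob_tcp_if_include appears to be silently
ignored in the 1.2 series (the parameter there is oob_tcp_include), and
restricting the OOB layer to lo0 fails with the "invalid address ''" error
in any case.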
> >> On May 29, 2007, at 3:58 PM, Bill Saphir wrote:
> >>
> >
> > ----- original message below ---
> >
> >> We have run into the following problem:
> >>
> >> - start up Open MPI application on a laptop
> >> - disconnect from network
> >> - application hangs
> >>
> >> I believe that the problem is that all sockets created by Open MPI
> >> are bound to the external network interface.
> >> For example, when I start up a 2 process MPI job on my Mac (no
> >> hosts specified), I get the following tcp
> >> connections. 192.168.5.2 is an address on my LAN.
> >>
> >> tcp4  0  0  192.168.5.2.49459  192.168.5.2.49463  ESTABLISHED
> >> tcp4  0  0  192.168.5.2.49463  192.168.5.2.49459  ESTABLISHED
> >> tcp4  0  0  192.168.5.2.49456  192.168.5.2.49462  ESTABLISHED
> >> tcp4  0  0  192.168.5.2.49462  192.168.5.2.49456  ESTABLISHED
> >> tcp4  0  0  192.168.5.2.49456  192.168.5.2.49460  ESTABLISHED
> >> tcp4  0  0  192.168.5.2.49460  192.168.5.2.49456  ESTABLISHED
> >> tcp4  0  0  192.168.5.2.49456  192.168.5.2.49458  ESTABLISHED
> >> tcp4  0  0  192.168.5.2.49458  192.168.5.2.49456  ESTABLISHED
> >>
> >> Since this application is confined to a single machine, I would
> >> like it to use 127.0.0.1,
> >> which will remain available as the laptop moves around. I am
> >> unable to force it to bind
> >> sockets to this address, however.
> >>
> >> Some of the things I've tried are:
> >> - explicitly setting the hostname to 127.0.0.1 (--host 127.0.0.1)
> >> - turning off the tcp btl (--mca btl ^tcp) and other variations
> >> (--mca btl self,sm)
> >> - using --mca oob_tcp_include lo0
> >>
> >> The first two have no effect. The last one results in an error
> >> message of:
> >> [myhost.locall:05830] [0,0,0] mca_oob_tcp_init: invalid address ''
> >> returned for selected oob interfaces.
> >>
> >> Is there any way to force Open MPI to bind all sockets to 127.0.0.1?
> >>
> >> As a side question -- I'm curious what all of these tcp
> >> connections are used for. As I increase the number
> >> of processes, it looks like there are 4 sockets created per MPI
> >> process, without using the tcp btl.
> >> Perhaps stdin/out/err + control?
> >>
> >> Bill
> >>
> >>
> >
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users