Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Open-MPI between Mac and Linux (ubuntu 9.04) over wireless
From: Pallab Datta (datta_at_[hidden])
Date: 2009-09-22 16:51:24


Hi Rolf,

I ran the following:

pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 \
    -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 \
    --mca btl_tcp_if_include en0,wlan0 -np 2 -hetero \
    -H localhost,10.11.14.205 /tmp/hello

[fuji.local:02267] mca: base: components_open: Looking for btl components
[fuji.local:02267] mca: base: components_open: opening btl components
[fuji.local:02267] mca: base: components_open: found loaded component self
[fuji.local:02267] mca: base: components_open: component self has no
register function
[fuji.local:02267] mca: base: components_open: component self open
function successful
[fuji.local:02267] mca: base: components_open: found loaded component sm
[fuji.local:02267] mca: base: components_open: component sm has no
register function
[fuji.local:02267] mca: base: components_open: component sm open function
successful
[fuji.local:02267] mca: base: components_open: found loaded component tcp
[fuji.local:02267] mca: base: components_open: component tcp has no
register function
[fuji.local:02267] mca: base: components_open: component tcp open function
successful
[fuji.local:02267] select: initializing btl component self
[fuji.local:02267] select: init of component self returned success
[fuji.local:02267] select: initializing btl component sm
[fuji.local:02267] select: init of component sm returned success
[fuji.local:02267] select: initializing btl component tcp
[fuji.local][[59424,1],0][btl_tcp_component.c:468:mca_btl_tcp_component_create_instances]
invalid interface "wlan0"
[fuji.local:02267] select: init of component tcp returned success
[apex-backpack:31956] mca: base: components_open: Looking for btl components
[apex-backpack:31956] mca: base: components_open: opening btl components
[apex-backpack:31956] mca: base: components_open: found loaded component self
[apex-backpack:31956] mca: base: components_open: component self has no
register function
[apex-backpack:31956] mca: base: components_open: component self open
function successful
[apex-backpack:31956] mca: base: components_open: found loaded component sm
[apex-backpack:31956] mca: base: components_open: component sm has no
register function
[apex-backpack:31956] mca: base: components_open: component sm open
function successful
[apex-backpack:31956] mca: base: components_open: found loaded component tcp
[apex-backpack:31956] mca: base: components_open: component tcp has no
register function
[apex-backpack:31956] mca: base: components_open: component tcp open
function successful
[apex-backpack:31956] select: initializing btl component self
[apex-backpack:31956] select: init of component self returned success
[apex-backpack:31956] select: initializing btl component sm
[apex-backpack:31956] select: init of component sm returned success
[apex-backpack:31956] select: initializing btl component tcp
[apex-backpack][[59424,1],1][btl_tcp_component.c:468:mca_btl_tcp_component_create_instances]
invalid interface "en0"
[apex-backpack:31956] select: init of component tcp returned success
Process 0 on fuji.local out of 2
Process 1 on apex-backpack out of 2
[apex-backpack:31956] btl: tcp: attempting to connect() to address
10.11.14.203 on port 9360
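
A side note on the last log line: port 9360 is outside the requested range
(36900 through 36931), but 9360 is exactly 36900 with its two bytes swapped.
One possible reading, which is an assumption and not something confirmed in
this thread, is that the verbose message prints the port in network byte
order on a little-endian host, so the connect() may really be going to 36900.
The arithmetic can be checked with a short Python sketch (swap16 is a
hypothetical helper name, not part of Open MPI):

```python
import struct


def swap16(value: int) -> int:
    """Return a 16-bit value with its high and low bytes exchanged."""
    # Pack as big-endian, then reinterpret the same two bytes as little-endian.
    return struct.unpack("<H", struct.pack(">H", value))[0]


requested_port = 36900  # btl_tcp_port_min_v4 from the mpirun command line
logged_port = 9360      # port shown in the "attempting to connect()" message

# True if the logged port is just the requested port in the other byte order.
print(swap16(requested_port) == logged_port)
```

If the two values match, the odd-looking port number is likely a display
artifact and not evidence that the port settings were ignored.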

It launches the processes on both ends and then hangs at the send/receive
step.
What was the other thought you mentioned about why it might not be
working?
Please advise.
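
To separate a network problem from an MPI problem, a plain TCP probe can be
run from the Linux box toward the Mac while the job is hung. This is only a
minimal sketch: probe is a hypothetical helper, and the address and port in
the usage note are taken from the log and command line above. A "refused"
result still means the host is reachable and nothing is listening on that
port, while "timeout" hints that packets are being dropped somewhere on the
wireless path.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one plain TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"      # something accepted the connection
    except ConnectionRefusedError:
        return "refused"       # host answered, but no listener on this port
    except socket.timeout:
        return "timeout"       # no answer at all within the time limit
    except OSError as exc:
        return f"error: {exc}"  # e.g. no route to host


def probe_range(host: str, first_port: int, count: int, timeout: float = 3.0) -> dict:
    """Probe count consecutive ports starting at first_port."""
    return {p: probe(host, p, timeout) for p in range(first_port, first_port + count)}
```

For example, calling probe_range("10.11.14.203", 36900, 32) on the Linux box
while the job is hung would cover the range requested with
btl_tcp_port_min_v4 and btl_tcp_port_range_v4.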
--regards, pallab

> The --enable-heterogeneous option should do the trick. And to answer the
> previous question, yes, put both of the interfaces in the include list.
>
> --mca btl_tcp_if_include en0,wlan0
>
> If that does not work, then I may have one other thought why it might
> not work although perhaps not a solution.
>
> Rolf
>
> Pallab Datta wrote:
>> Hi Rolf,
>>
>> Do I need to configure Open MPI with any specific options apart from
>> --enable-heterogeneous?
>> I am currently using
>> ./configure --prefix=/usr/local/ --enable-heterogeneous --disable-static
>> --enable-shared --enable-debug
>>
>> on both ends. Is that correct? Please let me know.
>> Thanks and regards,
>> pallab
>>
>>
>>> Hi:
>>> I assume that if you wait several minutes then your program will
>>> actually time out, yes? I have two suggestions. First, can you run a
>>> non-MPI job over the wireless link, something like hostname? Second,
>>> you may want to specify the interfaces you want it to use on the two
>>> machines. You can do that via the "--mca btl_tcp_if_include"
>>> run-time parameter. Just list the ones that you expect it to use.
>>>
>>> Also, this is not right: "--mca OMPI_mca_mpi_preconnect_all 1". It
>>> should be "--mca mpi_preconnect_mpi 1" if you want to do the connection
>>> during MPI_Init.
>>>
>>> Rolf
>>>
>>> Pallab Datta wrote:
>>>
>>>> The following is the error dump
>>>>
>>>> fuji:src pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4
>>>> 36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca
>>>> btl
>>>> tcp,self --mca OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H
>>>> localhost,10.11.14.205 /tmp/hello
>>>> [fuji.local:01316] mca: base: components_open: Looking for btl
>>>> components
>>>> [fuji.local:01316] mca: base: components_open: opening btl components
>>>> [fuji.local:01316] mca: base: components_open: found loaded component
>>>> self
>>>> [fuji.local:01316] mca: base: components_open: component self has no
>>>> register function
>>>> [fuji.local:01316] mca: base: components_open: component self open
>>>> function successful
>>>> [fuji.local:01316] mca: base: components_open: found loaded component
>>>> tcp
>>>> [fuji.local:01316] mca: base: components_open: component tcp has no
>>>> register function
>>>> [fuji.local:01316] mca: base: components_open: component tcp open
>>>> function
>>>> successful
>>>> [fuji.local:01316] select: initializing btl component self
>>>> [fuji.local:01316] select: init of component self returned success
>>>> [fuji.local:01316] select: initializing btl component tcp
>>>> [fuji.local:01316] select: init of component tcp returned success
>>>> [apex-backpack:04753] mca: base: components_open: Looking for btl
>>>> components
>>>> [apex-backpack:04753] mca: base: components_open: opening btl
>>>> components
>>>> [apex-backpack:04753] mca: base: components_open: found loaded
>>>> component
>>>> self
>>>> [apex-backpack:04753] mca: base: components_open: component self has
>>>> no
>>>> register function
>>>> [apex-backpack:04753] mca: base: components_open: component self open
>>>> function successful
>>>> [apex-backpack:04753] mca: base: components_open: found loaded
>>>> component
>>>> tcp
>>>> [apex-backpack:04753] mca: base: components_open: component tcp has no
>>>> register function
>>>> [apex-backpack:04753] mca: base: components_open: component tcp open
>>>> function successful
>>>> [apex-backpack:04753] select: initializing btl component self
>>>> [apex-backpack:04753] select: init of component self returned success
>>>> [apex-backpack:04753] select: initializing btl component tcp
>>>> [apex-backpack:04753] select: init of component tcp returned success
>>>> Process 0 on fuji.local out of 2
>>>> Process 1 on apex-backpack out of 2
>>>> [apex-backpack:04753] btl: tcp: attempting to connect() to address
>>>> 10.11.14.203 on port 9360
>>>>
>>>>> Hi
>>>>>
>>>>> I am trying to run Open MPI 1.3.3 between a Linux box running Ubuntu
>>>>> Server 9.04 and a Macintosh. I have configured Open MPI with the
>>>>> following options:
>>>>> ./configure --prefix=/usr/local/ --enable-heterogeneous
>>>>> --disable-shared
>>>>> --enable-static
>>>>>
>>>>> When both machines are connected to the network via Ethernet cables,
>>>>> Open MPI works fine.
>>>>>
>>>>> But when I switch the Linux box to a wireless adapter, I can still
>>>>> reach (ping) the Macintosh, but Open MPI hangs on a hello world
>>>>> program.
>>>>>
>>>>> I ran :
>>>>>
>>>>> /usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca
>>>>> btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca
>>>>> OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H localhost,10.11.14.205
>>>>> /tmp/back
>>>>>
>>>>> It hangs on a send/receive call between the two ends. All firewalls
>>>>> are turned off at the Macintosh end. Please help ASAP.
>>>>> regards,
>>>>> pallab
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users_at_[hidden]
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>
>>>>>
>>>>>
>>> --
>>>
>>> =========================
>>> rolf.vandevaart_at_[hidden]
>>> 781-442-3043
>>> =========================
>>>
>>
>>
>
>
> --
>
> =========================
> rolf.vandevaart_at_[hidden]
> 781-442-3043
> =========================
>
>