Hi:
I assume if you wait several minutes than your program will actually time out, yes?  I guess I have two suggestions. First, can you run a non-MPI job using the wireless?  Something like hostname?  Secondly, you may want to specify the specific interfaces you want it to use on the two machines.  You can do that via the "--mca btl_tcp_if_include" run-time parameter.  Just list the ones that you expect it to use.

Also, this is not right - "--mca OMPI_mca_mpi_preconnect_all 1"  It should be --mca mpi_preconnect_mpi 1 if you want to do the connection during MPI_Init.

Rolf
  


Pallab Datta wrote:
The following is the error dump

fuji:src pallabdatta$ /usr/local/bin/mpirun --mca btl_tcp_port_min_v4
36900 -mca btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca btl
tcp,self --mca OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H
localhost,10.11.14.205 /tmp/hello
[fuji.local:01316] mca: base: components_open: Looking for btl components
[fuji.local:01316] mca: base: components_open: opening btl components
[fuji.local:01316] mca: base: components_open: found loaded component self
[fuji.local:01316] mca: base: components_open: component self has no
register function
[fuji.local:01316] mca: base: components_open: component self open
function successful
[fuji.local:01316] mca: base: components_open: found loaded component tcp
[fuji.local:01316] mca: base: components_open: component tcp has no
register function
[fuji.local:01316] mca: base: components_open: component tcp open function
successful
[fuji.local:01316] select: initializing btl component self
[fuji.local:01316] select: init of component self returned success
[fuji.local:01316] select: initializing btl component tcp
[fuji.local:01316] select: init of component tcp returned success
[apex-backpack:04753] mca: base: components_open: Looking for btl components
[apex-backpack:04753] mca: base: components_open: opening btl components
[apex-backpack:04753] mca: base: components_open: found loaded component self
[apex-backpack:04753] mca: base: components_open: component self has no
register function
[apex-backpack:04753] mca: base: components_open: component self open
function successful
[apex-backpack:04753] mca: base: components_open: found loaded component tcp
[apex-backpack:04753] mca: base: components_open: component tcp has no
register function
[apex-backpack:04753] mca: base: components_open: component tcp open
function successful
[apex-backpack:04753] select: initializing btl component self
[apex-backpack:04753] select: init of component self returned success
[apex-backpack:04753] select: initializing btl component tcp
[apex-backpack:04753] select: init of component tcp returned success
Process 0 on fuji.local out of 2
Process 1 on apex-backpack out of 2
[apex-backpack:04753] btl: tcp: attempting to connect() to address
10.11.14.203 on port 9360




  
Hi

I am trying to run open-mpi 1.3.3. between a linux box running ubuntu
server v.9.04 and a Macintosh. I have configured openmpi with the
following options.:
./configure --prefix=/usr/local/ --enable-heterogeneous --disable-shared
--enable-static

When both the machines are connected to the network via ethernet cables
openmpi works fine.

But when I switch the linux box to a wireless adapter i can reach (ping)
the macintosh
but openmpi hangs on a hello world program.

I ran :

/usr/local/bin/mpirun --mca btl_tcp_port_min_v4 36900 -mca
btl_tcp_port_range_v4 32 --mca btl_base_verbose 30 --mca
OMPI_mca_mpi_preconnect_all 1 -np 2 -hetero -H localhost,10.11.14.205
/tmp/back

it hangs on a send receive function between the two ends. All my firewalls
are turned off at the macintosh end. PLEASE HELP ASAP>
regards,
pallab
_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

    

_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
  


-- 

=========================
rolf.vandevaart@sun.com
781-442-3043
=========================