Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD
From: Steve Kargl (sgk_at_[hidden])
Date: 2011-07-08 15:42:11


On Fri, Jul 08, 2011 at 12:09:09PM -0700, Steve Kargl wrote:
> On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote:
> >
> > The easiest way to fix this is likely to use the btl_tcp_if_include
> > or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly
> > which interfaces to use:
> >
> > http://www.open-mpi.org/faq/?category=tcp#tcp-selection
> >
>
> Perhaps, I'm again misreading the output, but it appears that
> 1.4.4rc2 does not even see the 2nd nic.
>

So now I'm very confused! Using '--mca btl_tcp_if_include bge1,bge0'
seems to work even though Open MPI reports that bge1 is invalid, and
if I reverse the interfaces to '--mca btl_tcp_if_include bge0,bge1',
the process appears stuck. :(

hpc:kargl[341] /usr/local/openmpi-1.4.4/bin/mpiexec --mca btl_base_verbose 10 \
  --mca btl_tcp_if_include bge1,bge0 --mca btl tcp,self -machinefile mf1 ./z
...
[node11.cimu.org][[13885,1],1][btl_tcp_component.c:468:\
mca_btl_tcp_component_create_instances] invalid interface "bge1"
[node11.cimu.org:22024] select: init of component tcp returned success
0: hpc.apl.washington.edu
1: node11.cimu.org
Latency: 0.000073644
Sync Time: 0.000147468
Now starting main loop
  0: 0 bytes 16384 times --> 0.00 Mbps in 0.000073622 sec
  1: 1 bytes 16384 times --> 0.10 Mbps in 0.000073617 sec
  2: 2 bytes 3395 times --> 0.21 Mbps in 0.000073634 sec
  3: 3 bytes 1697 times --> 0.31 Mbps in 0.000073611 sec
...
126: 12582914 bytes 3 times --> 720.84 Mbps in 0.133178830 sec
[hpc.apl.washington.edu:12390] mca: base: close: component self closed
[hpc.apl.washington.edu:12390] mca: base: close: unloading component self
[hpc.apl.washington.edu:12390] mca: base: close: component tcp closed
[hpc.apl.washington.edu:12390] mca: base: close: unloading component tcp
[node11.cimu.org:22024] mca: base: close: component self closed
[node11.cimu.org:22024] mca: base: close: unloading component self
[node11.cimu.org:22024] mca: base: close: component tcp closed
[node11.cimu.org:22024] mca: base: close: unloading component tcp

hpc:kargl[342] /usr/local/openmpi-1.4.4/bin/mpiexec --mca btl_base_verbose 10 \
  --mca btl_tcp_if_include bge0,bge1 --mca btl tcp,self -machinefile mf1 ./z
...
[node11.cimu.org][[13868,1],1][btl_tcp_component.c:468:\
mca_btl_tcp_component_create_instances] invalid interface "bge1"
[node11.cimu.org:22048] select: init of component tcp returned success
0: hpc.apl.washington.edu
1: node11.cimu.org

and nothing!
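
In both runs the 'invalid interface "bge1"' line is printed by node11,
which suggests that node11 has no interface with that name. One way to
check this on each node is a small script that compares the names given
to btl_tcp_if_include against the interfaces the host actually has. The
following is only a rough sketch: the function names missing_interfaces
and local_interfaces are made up for illustration, and it is not how
Open MPI performs its own check.

```python
import socket
import sys


def missing_interfaces(if_include, available):
    """Return the names in a comma-separated if_include value
    that do not appear in the list of available interface names."""
    requested = [name.strip() for name in if_include.split(",") if name.strip()]
    return [name for name in requested if name not in available]


def local_interfaces():
    """Return the interface names known on this host.

    socket.if_nameindex() yields (index, name) pairs on Unix-like systems.
    """
    return [name for _index, name in socket.if_nameindex()]


if __name__ == "__main__":
    # Hypothetical usage: pass the same value given to btl_tcp_if_include.
    value = sys.argv[1] if len(sys.argv) > 1 else "bge0,bge1"
    for name in missing_interfaces(value, local_interfaces()):
        print(f'not present on this host: "{name}"')
```

Running it with 'bge0,bge1' on each host in the machinefile would show
whether bge1 exists everywhere. It says nothing about why the order of
the names changes the behaviour.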

-- 
Steve