Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] tcp communication problems with 1.4.3 and 1.4.4 rc2 on FreeBSD
From: Steve Kargl (sgk_at_[hidden])
Date: 2011-07-12 15:26:30


On Tue, Jul 12, 2011 at 11:03:42AM -0700, Steve Kargl wrote:
> On Tue, Jul 12, 2011 at 10:37:14AM -0700, Steve Kargl wrote:
> > On Fri, Jul 08, 2011 at 07:03:13PM -0400, Jeff Squyres wrote:
> > > Sorry -- I got distracted all afternoon...
> > >
> > > In addition to what Ralph said (i.e., I'm not sure if the CIDR
> > > notation stuff made it over to the v1.5 branch or not, but it
> > > is available from the nightly SVN trunk tarballs:
> > > http://www.open-mpi.org/nightly/trunk/), here's a few points
> > > from other mails in this thread...
> > >
> >
> > trunk does not appear to be an option. :-(
> >
> > % svn co http://svn.open-mpi.org/svn/ompi/trunk ompi
> > % cd ompi
> > % ./autogen.pl
> > % ./configure --enable-mpirun-prefix-by-default --prefix=/usr/local/ompi \
> > --disable-shared --enable-static
> >
> > (many lines removed)
> > checking prefix for function in .type... @
> > checking if .size is needed... yes
> > checking if .align directive takes logarithmic value... no
> > configure: error: No atomic primitives available for amd64-unknown-freebsd9.0
>
> It seems the configure script does not recognize amd64. If I add
> --build='x86_64-*-freebsd' to the configure line, then everything
> appears to work.
>
> I'll report back after I've had a chance to work with ompi built
> from trunk.
>

There's good news and some bad news.

I got trunk to build and install. I compiled the NetPIPE
code with

% /usr/local/ompi/bin/mpicc -o z -O GetOpt.c netmpi.c

Bad news:

I can then run

% /usr/local/ompi/bin/mpiexec -machinefile mf --mca btl self,tcp \
  --mca btl_base_verbose 30 ./z

with mf containing

node11 slots=1 (node11 contains a single bge0=192.168.0.11)
node16 slots=1 (node16 contains a single bge0=192.168.0.16)

or

node11 slots=2 (communication on memory bus)

However, if mf contains

node10 slots=1 (node10 contains bge0=10.208.xx and bge1=192.168.0.10)
node16 slots=1 (node16 contains a single bge0=192.168.0.16)

I see the same problem where node10 cannot communicate with node16.

Good news:

Adding 'btl_tcp_if_include=192.168.0.0/16' to my ~/.openmpi/mca-params.conf
file seems to cure the communication problem.
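
For readers unfamiliar with CIDR notation, here is a rough sketch of why
that value picks the 192.168.0.x interfaces on both nodes and leaves
node10's 10.208 interface out. It only models the prefix match using
Python's standard ipaddress module; it is not Open MPI's interface
selection code, and the helper name `selected` is made up for
illustration. node10's bge0 address is elided above, so it is
represented only by the 10.0.0.0/8 block that any 10.208.* address
falls in.

```python
import ipaddress

# Network given to btl_tcp_if_include in the message above.
include_net = ipaddress.ip_network("192.168.0.0/16")

# Interface addresses quoted in the message for the failing machinefile.
interfaces = {
    ("node10", "bge1"): "192.168.0.10",
    ("node16", "bge0"): "192.168.0.16",
}


def selected(addr, net=include_net):
    """Hypothetical helper: True if addr lies inside the included network."""
    return ipaddress.ip_address(addr) in net


# Which of the quoted interfaces match the /16 prefix.
matches = {key: selected(addr) for key, addr in interfaces.items()}

# node10's bge0 address is not given in full, so test the whole
# 10.0.0.0/8 block instead of guessing a specific address.
ten_block_excluded = not include_net.overlaps(ipaddress.ip_network("10.0.0.0/8"))

for (node, ifname), ok in matches.items():
    print(node, ifname, "selected" if ok else "skipped")
print("10.0.0.0/8 excluded:", ten_block_excluded)
```

Under these assumptions both 192.168.0.x interfaces match, and no
address in the 10.x block can, which is consistent with the TCP BTL no
longer trying to reach node16 over node10's bge0.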

Thanks for the help. If I run into any other problems with trunk,
I'll report those here.

-- 
Steve