Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] tcp connectivity OS X and 1.3.3
From: Jody Klymak (jklymak_at_[hidden])
Date: 2009-08-13 12:22:39


On Aug 12, 2009, at 19:09 PM, Ralph Castain wrote:

> Hmmm...well, I'm going to ask our TCP friends for some help here.
>
> Meantime, I do see one thing that stands out. Port 4 is an awfully
> low port number that usually sits in the reserved range. I checked
> the /etc/services file on my Mac, and it was commented out as
> unassigned, which should mean it was okay.
>
> Still, that is an unusual number. The default minimum port number is
> 1024, so I'm puzzled how you wound up down there. Of course, could
> just be an error in the print statement, but let's try moving it to
> be safe? Set
>
> -mca btl_tcp_port_min_v4 36900 -mca btl_tcp_port_range_v4 32
> and see what happens.

What happens is that everything works now, both connectivity_c and
the MITgcm! I haven't tried under Torque yet, but let's declare an
Open MPI victory at this point.
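
For anyone reading this in the archive later, here is a minimal sketch of which ports those two settings cover. It assumes that btl_tcp_port_range_v4 is a count of ports starting at btl_tcp_port_min_v4; the thread does not spell that out, so treat it as an assumption. The function name tcp_port_window is made up for illustration and is not part of Open MPI.

```python
def tcp_port_window(port_min: int, port_range: int) -> range:
    """Return the ports covered by a (minimum, count) pair.

    Assumption: the range parameter is a number of ports counted
    upward from the minimum, so the window is [min, min + count).
    """
    if not 0 < port_min <= 65535:
        raise ValueError("port_min must be between 1 and 65535")
    if port_range < 1 or port_min + port_range - 1 > 65535:
        raise ValueError("port window must be non-empty and end at or below 65535")
    return range(port_min, port_min + port_range)


window = tcp_port_window(36900, 32)
print(window[0], window[-1], len(window))   # prints: 36900 36931 32

# Every port in the window sits above the reserved range (below 1024).
print(all(port >= 1024 for port in window))  # prints: True
```

Under that assumption, any firewall between the nodes would need to allow TCP ports 36900 through 36931.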

On Aug 13, 2009, at 8:28 AM, Jeff Squyres wrote:

> Agreed -- ports 4 and 260 should be in the reserved ports range.
> Are you running as root, perchance?

Errrr, no, but yes. My user account has admin privileges. A sloppy
workstation OS X habit I now regret propagating to my cluster. I'm
sorry for not mentioning it earlier as possibly relevant.

As a suggestion, btl_base_verbose could be mentioned as a good
debugging tool in the troubleshooting section of the FAQ. It's on the
page to do with TCP, which I admit I should have read as soon as I
realized there was a communication issue, but having it in the
troubleshooting section would be helpful too. Maybe a more erudite
version of:

Checking connections between nodes:

Sometimes the configuration of a cluster makes it impossible for nodes
to communicate properly. To debug this, it helps to include --mca
btl_base_verbose 30 as a command line argument (see
http://www.open-mpi.org/faq/?category=tcp for more information). The
program examples/connectivity_c.c is also a useful minimal program for
testing communication on the cluster.
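
As a rough illustration of the suggestion above, the sketch below assembles such a debugging command line. It only builds and prints the command; it does not run mpirun. The helper name build_mpirun_command, the process count of 2, and the binary path ./connectivity_c (assumed to be compiled from the connectivity example) are all hypothetical choices for the example, and a real cluster run would usually also need a host list or a scheduler.

```python
import shlex


def build_mpirun_command(program, nprocs, mca_params):
    """Build an mpirun argument list with one --mca pair per parameter.

    Hypothetical helper for illustration; it is not part of Open MPI.
    """
    cmd = ["mpirun", "-np", str(nprocs)]
    for name, value in mca_params.items():
        cmd += ["--mca", name, str(value)]
    cmd.append(program)
    return cmd


cmd = build_mpirun_command(
    "./connectivity_c",  # assumed path to the compiled connectivity test
    2,                   # assumed process count
    {
        "btl_base_verbose": 30,        # verbose BTL output, as suggested above
        "btl_tcp_port_min_v4": 36900,  # port settings from earlier in this thread
        "btl_tcp_port_range_v4": 32,
    },
)
print(shlex.join(cmd))
# prints: mpirun -np 2 --mca btl_base_verbose 30 --mca btl_tcp_port_min_v4 36900 --mca btl_tcp_port_range_v4 32 ./connectivity_c
```

Keeping the parameters in a dictionary makes it easy to drop the port settings and leave only btl_base_verbose when the goal is just to see what the TCP layer is doing.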

Thanks again for everyone's help, particularly Ralph, Jeff and Gus.

Cheers, Jody