
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] Enabling debugging and profiling in openMPI (make "CFLAGS=-pg -g")
From: Nifty Tom Mitchell (niftyompi_at_[hidden])
Date: 2009-06-15 14:36:27


On Fri, Jun 12, 2009 at 10:30:49PM +0530, Leo P. wrote:
>
> Thank you Ralph and Samuel.
> Sorry for the complete newbie question.
> The reason that i wanted to study openMPI is because i wanted to make
> open MPI support nodes that are behind NAT or firewall. If you guys
> could give me some pointers on how to go about doing this i would
> appreciate alot. I am considering this for my thesis project.
> Sincerely,
> LEO
> __________________________________________________________________
>
> From: Ralph Castain <rhc_at_[hidden]>
> If you do a "./configure --help" you will get a complete list of the
.....

Is the goal to add hosts behind a NAT to a cluster, or is
this a cluster behind a NAT?

For a cluster behind a NAT some issues get in the way.
The first is how mpirun connects to and starts
jobs. To that end compare and contrast
    a) mpirun --hostfile /home/me/filewithhosts ./yourmpiprogram
    b) ssh <host-behind-NAT> mpirun --hostfile /home/me/filewithhosts ./yourmpiprogram

Next, c) the issue of ./ and /home/me needs to be clearly understood.
Data, files and file system paths need to be managed in a like manner.

In the a) case mpirun (aka orterun) needs to make an identical connection to
each of the hosts listed in 'filewithhosts'. This is not possible in
the NAT case, as the NAT box presents exactly one host behind its
IP address. In the b) case the host behind the NAT box can make direct
connections to each of the systems listed in 'filewithhosts', so
"mpirun" running there can reach every host and start the job.
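
To make the a) versus b) contrast concrete, here is a minimal Python sketch
that only builds the two command lines and does not run anything. The host
name "headnode" is a made-up stand-in for whichever machine behind the NAT
you can reach with ssh; the function names are mine, not part of Open MPI.

```python
import shlex


def direct_launch(hostfile, program):
    """Case a): mpirun is started on the local machine."""
    return ["mpirun", "--hostfile", hostfile, program]


def launch_via_inside_host(inside_host, hostfile, program):
    """Case b): ssh to a host behind the NAT and start mpirun there.

    inside_host is a hypothetical name for a reachable machine behind the NAT.
    The mpirun command is quoted into one string for the remote shell.
    """
    remote_command = shlex.join(direct_launch(hostfile, program))
    return ["ssh", inside_host, remote_command]


if __name__ == "__main__":
    print(direct_launch("/home/me/filewithhosts", "./yourmpiprogram"))
    print(launch_via_inside_host("headnode", "/home/me/filewithhosts",
                                 "./yourmpiprogram"))
```

Note that in the b) form both /home/me/filewithhosts and ./yourmpiprogram are
looked up on the inside host, not on the machine where you typed ssh.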

Some of this may be hidden by a batch system that hides the first ssh connection.

As for files c), each of the nodes running a rank needs to see the program
and data files. For the most part this is the same issue NAT or not.
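
One way to check the file visibility point is sketched below in Python. It
turns relative paths such as ./yourmpiprogram into absolute ones and builds
an "ssh host test -e path" command per node, without executing anything.
The helper names and the idea of probing with "test -e" over ssh are my
assumptions for illustration; a shared file system or a batch system may
make this unnecessary.

```python
import posixpath


def absolute_paths(workdir, paths):
    """Resolve each path against workdir so it names the same file on every node."""
    return [posixpath.normpath(posixpath.join(workdir, p)) for p in paths]


def visibility_checks(hosts, workdir, paths):
    """Build one ['ssh', host, 'test', '-e', path] command per host and path.

    The commands are only returned, not run. Paths containing spaces would
    need extra quoting for the remote shell.
    """
    resolved = absolute_paths(workdir, paths)
    return [["ssh", host, "test", "-e", path]
            for host in hosts
            for path in resolved]


if __name__ == "__main__":
    for cmd in visibility_checks(["node1", "node2"], "/home/me",
                                 ["./yourmpiprogram"]):
        print(" ".join(cmd))
```

Each command could then be run with subprocess.run(cmd).returncode, where 0
would mean that node can see the file.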

Complications for debugging can involve $DISPLAY for X as well as ssh
X-windows tunneling and display permissions.

One interesting solution built into Open MPI is the use of IPv6.
IPv6 can come into play if you are adding hosts behind a NAT to a cluster
and deploying a cluster behind a NAT.

Lastly, host name spaces outside the NAT and behind the NAT can be
a royal pain. When orterun (mpirun) makes its connections, host name
resolution for the ssh/rsh jobs as well as the transport links needs
to work correctly. Since a NAT is not a router, resolving hosts
behind the NAT at best returns an unrouted private internet destination.
Since many campuses route private internets for local use, unexpected
netmasks, host routes and routing tables may surface.
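
A quick way to see what name resolution gives you from a particular machine
is sketched below in Python. It assumes the hostfile has one host per line
as its first token, with "#" starting a comment, and it labels addresses
using the standard ipaddress module, whose "private" test covers the
RFC 1918 ranges along with some other non-global ranges. The function names
are mine. Run it both outside and behind the NAT and compare the answers.

```python
import ipaddress
import socket


def hosts_from_hostfile(text):
    """Return the first token of every non-empty, non-comment line."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line.split()[0])
    return hosts


def classify_address(addr):
    """Label an IP address string as 'loopback', 'private' or 'public'."""
    # Drop an IPv6 scope suffix such as '%eth0' before parsing.
    ip = ipaddress.ip_address(addr.split("%", 1)[0])
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"


def resolve_report(hosts):
    """Map each host to a list of (address, label), or None if it does not resolve."""
    report = {}
    for host in hosts:
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            report[host] = None
            continue
        addresses = sorted({info[4][0] for info in infos})
        report[host] = [(a, classify_address(a)) for a in addresses]
    return report


if __name__ == "__main__":
    with open("/home/me/filewithhosts") as f:
        for host, result in resolve_report(hosts_from_hostfile(f.read())).items():
            print(host, result)
```

A host that resolves to a "private" address from outside the NAT is the case
described above: the name resolves, but the destination may not be routed
from where you are.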

-- 
	T o m  M i t c h e l l 
	Found me a new hat, now what?