I've run into a problem accessing a remote host from behind a
router, and I think I need to understand how Open MPI determines IP
addresses. If there's anything posted on this subject, please point
me to it.
Here's the problem:
I've installed Open MPI (1.4.3) on a remote system (an Amazon EC2
instance). If the local system I'm working on has a static IP
address (and a direct connection to the internet), there's no
problem. But if the local system accesses the internet through a
router (which itself gets its IP via DHCP), a call to the mpirun
command hangs.
This is not a firewall problem - I've disabled the firewalls on all
the systems that are involved (and on the router).
It is also not an ssh problem. The ssh connection is being made and
it appears that the application has been launched on the remote
system. After the mpirun command has been launched locally, a ps on
the remote system shows this process:
orted --daemonize -mca ess env -mca orte_ess_jobid
1187643392 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2
--hnp-uri 1187643392.0;tcp://192.168.1.101:35272
While I don't really understand the orted process, I assume this
indicates that a command to execute an app has been received and
that Open MPI is trying to run it.
I suspect that the problem is related to the '--hnp-uri ...
tcp://192.168.1.101' argument. 192.168.1.101 is the address of my
local system on my local network (attached to the router), which of
course is not accessible over the net. It appears that Open MPI is
transmitting the local (static) IP address to the remote host.
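
To illustrate the suspicion above, here is a small Python sketch that
splits an --hnp-uri value of the form seen in the ps output and reports
whether each tcp:// address is private. The helper name hnp_uri_hosts
and the assumed layout (an identifier, then semicolon-separated
tcp://host:port contacts) are my own guesses from that one example, not
something taken from the Open MPI documentation.

```python
import ipaddress


def hnp_uri_hosts(hnp_uri):
    """Return (host, port, is_private) for each tcp:// contact in the URI.

    Assumed layout, inferred from a single ps listing:
        <identifier>;tcp://<host>:<port>[;tcp://<host>:<port>...]
    """
    results = []
    # Skip the leading identifier; the remaining fields are contact URIs.
    for part in hnp_uri.split(";")[1:]:
        if not part.startswith("tcp://"):
            continue
        host, _, port = part[len("tcp://"):].rpartition(":")
        addr = ipaddress.ip_address(host)
        # is_private is True for RFC 1918 ranges such as 192.168.0.0/16.
        results.append((str(addr), int(port), addr.is_private))
    return results


print(hnp_uri_hosts("1187643392.0;tcp://192.168.1.101:35272"))
# prints [('192.168.1.101', 35272, True)]
```

A True in the last field means the remote orted has been handed an
address it cannot route to from outside my local network, which would be
consistent with the hang.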
It would help to know how Open MPI determines and distributes IP
addresses, and whether there's any way to control this.
Any thoughts on dealing with this would be greatly appreciated.
Thanks,
bw