On Jul 30, 2009, at 6:36 AM, David Doria wrote:
On Wed, Jul 29, 2009 at 4:57 PM, Ralph Castain <firstname.lastname@example.org> wrote:
Ah, so there is a firewall involved? That is always a problem. I gather that node 126 has clear access to all other nodes, but nodes 122, 123, and 125 do not all have access to each other?
See if your admin is willing to open at least one port on each node that can reach all other nodes. It is easiest if it is the same port for every node, but not required. Then you can try setting the mca params oob_tcp_port_minv4 and oob_tcp_port_rangev4. This should allow the daemons to communicate.
Check ompi_info --param oob tcp for info on those (and other) params.
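As a rough sketch of the suggestion above, the two parameters could be passed on the mpirun command line with --mca. The port values here are hypothetical placeholders for whatever the admin opens, and the parameter names are copied as written in this thread; the exact spelling may differ in a given Open MPI release (for example, with an underscore before "v4"), so check the ompi_info output first.

```shell
# Hypothetical values: the first port the admin opened, and how many
# consecutive ports starting from it are available.
PORT_MIN=46000
PORT_RANGE=10

# Parameter names are taken from the message above; confirm the exact
# spelling for your Open MPI version with: ompi_info --param oob tcp
MCA_ARGS="--mca oob_tcp_port_minv4 $PORT_MIN --mca oob_tcp_port_rangev4 $PORT_RANGE"

# Print the full command; remove the leading "echo" to actually launch the job.
echo mpirun $MCA_ARGS -H 10.1.2.126,10.1.2.122,10.1.2.123,10.1.2.125 hello-mpi
```

The same ports would need to be open in the firewall on every node in the host list for the daemons to reach each other.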
On Jul 29, 2009, at 2:46 PM, David Doria wrote:
Machine 125 had the default Fedora firewall turned on. I turned it off and it now works with simply:
mpirun -H 10.1.2.126,10.1.2.122,10.1.2.123,10.1.2.125 hello-mpi
(the firewalls on the rest of the machines were already off in an attempt to avoid problems like this - I guess I just forgot one!)
Is there a "standard" port I can open on these local firewalls so I don't have to disable them completely and so I don't have to set mca params oob_tcp_port_X?
I'm afraid not - however, you can set those params inside the default MCA param file so you never have to type them again.
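A minimal sketch of that approach, assuming the per-user parameter file at $HOME/.openmpi/mca-params.conf (a system-wide openmpi-mca-params.conf under the installation's etc directory is the other usual location). The port values are hypothetical, and the parameter names are copied as written in this thread, so verify them with ompi_info --param oob tcp before relying on them.

```shell
# Per-user MCA parameter file; override PARAM_FILE to write somewhere else.
PARAM_FILE="${PARAM_FILE:-$HOME/.openmpi/mca-params.conf}"
mkdir -p "$(dirname "$PARAM_FILE")"

# Append one "name = value" line per parameter.
# Hypothetical port values; names as written in the thread.
cat >> "$PARAM_FILE" <<'EOF'
oob_tcp_port_minv4 = 46000
oob_tcp_port_rangev4 = 10
EOF
```

With the file in place, the plain mpirun -H ... hello-mpi command should pick up the settings without any --mca flags. Since the daemons on the remote nodes also need to know the ports, the file would likely have to be present on each node or in a shared home directory.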