Hello Community/Ralph

My sysadmin tells me that the firewall does not block communication between machines inside our network (tik33x and tik34x, for instance). They suggest the connection could fail for another reason, for example if OpenMPI tries to bind to TCP/UDP ports below 1024, which require superuser privileges.

Is it possible to find out which port numbers OpenMPI uses? Specifically, is it possible to specify port numbers that OpenMPI must not use (I am on OpenMPI-1.4.x)?
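
For reference, here is a minimal sketch of how a port window could be passed to mpirun. It assumes the TCP BTL parameters btl_tcp_port_min_v4 (first port) and btl_tcp_port_range_v4 (number of ports) are available in the installed version, which should be confirmed with "ompi_info --param btl tcp". The helper name tcp_port_args and the example values are hypothetical.

```python
def tcp_port_args(port_min, port_count):
    """Build mpirun MCA arguments that limit the TCP BTL to a port window.

    Assumption: the installed Open MPI exposes btl_tcp_port_min_v4 and
    btl_tcp_port_range_v4; verify with `ompi_info --param btl tcp`.
    """
    if port_min < 1024:
        # Ports below 1024 can only be bound by the superuser.
        raise ValueError("ports below 1024 need superuser privileges")
    if port_count < 1 or port_min + port_count - 1 > 65535:
        raise ValueError("port window must fit within 1024..65535")
    return [
        "--mca", "btl_tcp_port_min_v4", str(port_min),
        "--mca", "btl_tcp_port_range_v4", str(port_count),
    ]


if __name__ == "__main__":
    # Hypothetical window: 100 ports starting at 10000.
    print(" ".join(["mpirun"] + tcp_port_args(10000, 100)))
```

This would only constrain the MPI point-to-point traffic of the TCP BTL. The runtime's out-of-band channel also uses TCP and has its own settings, which this sketch does not cover.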

Here is the reply I got from my sysadmin:

"There is a firewall, but it does not block internal
traffic within the whole TIK network (I verified it for myself).
Thus, the connection problem must be somewhere else
(a service not running or binding to the wrong interface
for instance). Maybe the service wants to bind to a
tcp or udp port lower than 1024, which can only be
allocated by the system's superuser. First, check on
which ports and on which network card interfaces
the software listens and if it is configured correctly
so that it will listen at all."
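
The privileged-port point in the reply above can be checked independently of OpenMPI. The following is a small sketch, not anything OpenMPI itself provides: the helper try_bind is hypothetical and simply attempts to bind a TCP socket, then reports whether the operating system allowed it.

```python
import errno
import socket


def try_bind(port, host=""):
    """Try to bind a TCP socket to (host, port) and report what happened.

    An empty host means "all interfaces"; port 0 asks the OS for any
    free unprivileged port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            # Typical result for ports below 1024 without superuser rights.
            return "permission denied"
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return "in use"
            return "error: " + str(exc)
        return "ok"


if __name__ == "__main__":
    for port in (80, 0):
        print(port, try_bind(port))
```

Passing a specific address as host (for example the address of one network card) would show whether binding works on that interface at all.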


Is there a way out?

Thanks a lot

Devendra


From: Ralph Castain <rhc@open-mpi.org>
To: devendra rai <rai.devendra@yahoo.co.uk>; Open MPI Users <users@open-mpi.org>
Sent: Wednesday, 16 May 2012, 15:09
Subject: Re: [OMPI users] Returned "Unreachable" (-12) instead of "Success" (0)

Looks like you have a firewall between hosts tik34x and tik33x - you might check to ensure all firewalls are disabled. The error is saying it can't open a TCP socket between the two nodes, so there is no communication path between those two processes.
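
Whether a TCP connection between the two hosts is possible at all can be tested outside of MPI. A minimal sketch, assuming some process is already listening on the chosen port on the remote host; the helper name can_connect and the port number in the example are hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable hosts,
        # and name-resolution failures.
        return False


if __name__ == "__main__":
    # Run on tik34x while a listener is active on tik33x, port 10000.
    print(can_connect("tik33x", 10000))
```

A False result with a listener known to be running would point at filtering or routing between the nodes rather than at the MPI configuration.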


On May 16, 2012, at 4:22 AM, devendra rai wrote:

Hello All,

I am trying to run an OpenMPI application across two physical machines.

I get the error "Returned "Unreachable" (-12) instead of "Success" (0)". Looking through the logs (attached), I cannot find the cause or how to fix it.

I see a lot of (communication) components being loaded and then unloaded, but I cannot tell which node picks up which kind of communication interface.

--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[10782,1],6]) is on host: tik34x
  Process 2 ([[10782,1],0]) is on host: tik33x
  BTLs attempted: self sm tcp

Your MPI job is now going to abort; sorry.

The "mpirun" line is:

mpirun --mca btl self,sm,tcp --mca btl_base_verbose 30 -report-pid -display-map -report-bindings -hostfile hostfile -np 7 -v --rankfile rankfile.txt -v --timestamp-output --tag-output ./xstartwrapper.sh ./run_gdb.sh 

where the .sh files are workarounds for forwarding X windows from multiple machines to the machine where I am logged in.

Can anyone help?

Thanks a lot.

Best,

Devendra



From: devendra rai <dev641@yahoo.co.uk>
Subject: Returned "Unreachable" (-12) instead of "Success" (0)
Date: May 16, 2012 4:18:28 AM MDT
To: Open MPI Users <users@open-mpi.org>
Reply-To: devendra rai <rai.devendra@yahoo.co.uk>



<MPILog.txt>

_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users