On Wed, Mar 17, 2010 at 1:41 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> On Mar 17, 2010, at 4:39 AM, <uriz.49949_at_[hidden]> wrote:
>> Hi everyone, I'm a new Open MPI user and I have just installed Open MPI on
>> a 6-node cluster running Scientific Linux. When I run it locally it
>> works perfectly, but when I try to run it on the remote nodes with the
>> --host option it hangs and prints no message. I think the problem
>> could be with the shared libraries, but I'm not sure. In my opinion the
>> problem is not ssh, because I can access the nodes without a password.
> You might want to check that Open MPI processes are actually running on the remote nodes -- check with ps if you see any "orted" or other MPI-related processes (e.g., your processes).
> Do you have any TCP firewall software running between the nodes? If so, you'll need to disable it (at least for Open MPI jobs).
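To make Jeff's two checks concrete, here is a rough sketch. The node names are made up, and I am assuming the firewall (if any) is plain iptables and that you can run it as root on the nodes; adjust for your setup. The script only prints the commands, so you can look them over before pasting them into a shell.

```shell
#!/bin/sh
# Hypothetical node names; use the ones you pass to --host.
NODES="node1 node2"

# Command that lists any orted daemons running on the given node.
orted_check_cmd() {
    printf 'ssh %s ps -ef | grep orted\n' "$1"
}

# Command that lists the firewall rules on the given node.
# Assumption: the firewall is iptables and the remote user is root.
firewall_check_cmd() {
    printf 'ssh %s /sbin/iptables -L -n\n' "$1"
}

for node in $NODES; do
    orted_check_cmd "$node"
    firewall_check_cmd "$node"
done
```

If the first command shows nothing while mpirun is hanging, the daemons never started on that node. If the second shows REJECT or DROP rules, that is the first thing I would look at.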
I also recommend running mpirun with the option
--mca btl_base_verbose 30 to troubleshoot TCP issues.
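For example, something along these lines. The host names and ./my_mpi_app are placeholders of mine, not anything from your setup. Note that the program should be a real MPI program: the BTL components are only set up inside MPI_Init, so a plain command such as hostname would not produce any BTL output. The script just prints the command line to run.

```shell
#!/bin/sh
# Placeholders: replace with your own hosts and MPI executable.
HOSTS="node1,node2"
PROG="./my_mpi_app"

# Build an mpirun command line with BTL verbosity set to 30.
verbose_cmd() {
    printf 'mpirun --host %s --mca btl_base_verbose 30 %s\n' "$1" "$2"
}

verbose_cmd "$HOSTS" "$PROG"
```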
In some environments, you need to explicitly tell mpirun what network
interfaces it can use to reach the hosts. Read the following FAQ
section for more information:
Item 7 of the FAQ might be of special interest.
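As a sketch of what restricting the interfaces can look like: the btl_tcp_if_include MCA parameter limits the TCP BTL to the interfaces you name. The interface name eth0, the host names and ./my_mpi_app below are assumptions of mine; check ifconfig on your nodes for the interface that actually connects them. Again, the script only prints the command line.

```shell
#!/bin/sh
# Assumptions: eth0 is the interface shared by all nodes;
# host names and executable are placeholders.
HOSTS="node1,node2"
IFACE="eth0"
PROG="./my_mpi_app"

# Build an mpirun command line that limits the TCP BTL to one interface.
if_include_cmd() {
    printf 'mpirun --host %s --mca btl_tcp_if_include %s %s\n' "$1" "$2" "$3"
}

if_include_cmd "$HOSTS" "$IFACE" "$PROG"
```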