Open MPI User's Mailing List Archives

From: Ralph Castain (rhc_at_[hidden])
Date: 2006-12-01 10:48:12


What the system is saying is that (a) you don't have transparent
(passwordless) ssh access to one or more of your nodes, and/or (b) the
system was unable to locate the Open MPI code libraries on the remote node.
For the first problem, please see the FAQ at:

http://www.open-mpi.org/faq/?category=rsh#ssh-keys
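
As a rough illustration of checking point (a), the sketch below tries a non-interactive ssh login to a node. It is only a sketch under assumptions: the host name passed in is a placeholder for whatever is in your own hostfile, and the helper names are made up for this example. The ssh option BatchMode=yes makes ssh fail rather than prompt for a password, which is how a missing key setup shows up.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails instead of prompting for a password.

    'host' is a placeholder; use the node names from your own hostfile.
    """
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def can_login_without_password(host):
    """Return True if a non-interactive ssh login to 'host' succeeds."""
    try:
        result = subprocess.run(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # ssh itself could not be started (for example, it is not installed).
        return False
    return result.returncode == 0
```

Calling can_login_without_password() for every node you intend to use should return True for each of them before mpirun can be expected to start processes there.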

Once you have that fixed, you should check the remote nodes to ensure
that the Open MPI code libraries are available. You may need to provide a
prefix directory to mpirun (the --prefix option) to tell us where they are.
For some advice in that area, please see the FAQ at:

http://www.open-mpi.org/faq/?category=running
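
To make point (b) concrete, here is a small sketch that builds two command lines: one that checks over ssh whether the Open MPI daemon (orted) exists under a given install prefix on a remote node, and one that starts mpirun with that prefix. The prefix path, hostfile name, process count and program name in the example call are assumptions for illustration only; substitute your own values.

```python
def remote_orted_check_command(host, prefix):
    """Build an ssh command that tests for an executable orted under prefix/bin."""
    return ["ssh", "-o", "BatchMode=yes", host, "test", "-x", prefix + "/bin/orted"]


def mpirun_command(prefix, num_procs, hostfile, program):
    """Build an mpirun command line that tells Open MPI where it is installed
    on the remote nodes via --prefix."""
    return [
        "mpirun",
        "--prefix", prefix,
        "-np", str(num_procs),
        "--hostfile", hostfile,
        program,
    ]


# Example with made-up values (all four arguments are hypothetical):
example = mpirun_command("/opt/openmpi", 16, "hosts.txt", "./my_app")
print(" ".join(example))
```

If the remote check fails on some node, either the prefix is wrong for that node or Open MPI is not installed there.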

Hope that helps
Ralph

On 12/1/06 8:17 AM, "Jens Klostermann"
<jens.klostermann_at_[hidden]> wrote:

> I've got the same problem as described in:
> http://www.open-mpi.org/community/lists/users/2006/07/1537.php
>
> From: Chengwen Chen (chenchengwen_at_[hidden])
> Date: 2006-07-04 03:53:26
>
>
>
> The problem seems to occur randomly! It occurs more often if I use a
> larger number of CPUs, but almost never if I use a small number of CPUs.
> So far my cure for the problem is to kill and restart my application
> (mpirun ...) repeatedly until the error no longer occurs and mpirun runs.
>
> So, has the problem been resolved? Can anybody give me a hint?
>
> I am using an amd64 Linux (SUSE 10) cluster with an InfiniBand connection
> and openmpi-1.2a1r10111.
>
> I attach the output of ompi_info --param all all; I hope it helps.
>
> Regards Jens