On Mar 29, 2007, at 1:08 PM, Jens Klostermann wrote:
> In reply to
> http://www.open-mpi.org/community/lists/users/2006/12/2286.php
>
> I recently switched to Open MPI 1.2; unfortunately, the password problem
> still persists! I generated new RSA keys and set up passwordless ssh.
> I tested this by logging in to each node via passwordless ssh;
> fortunately there are only 16 nodes :-).
> The funny thing is that it seems to be a problem only with my user and
> appears randomly, but is more likely if I use more nodes.
Is the problem still something like this:
----
[say@wolf45 tmp]$ mpirun -np 2 --host wolf45,wolf46 /tmp/test.x
orted: Command not found.
----
Because if so, it's a larger / non-MPI issue. If the orted
executable cannot be found on the remote node, there's no way Open
MPI will succeed.
The question of *why* orted can't be found may be a somewhat deeper
problem -- if you have your PATH set correctly, etc., perhaps it's an
NFS issue...?
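
One way to narrow this down is to check whether orted is visible to a
non-interactive ssh session on each node, since that is the kind of
session the rsh launcher uses. A minimal sketch, assuming a Bourne-style
shell locally and that "which" works on the remote side; check_orted and
the REMOTE_SH variable are hypothetical names used only for illustration:

```shell
# Hypothetical helper: report whether orted is on the PATH that a
# non-interactive remote shell sees on the given host.
# REMOTE_SH is an assumed override for the remote-shell command.
# The default uses BatchMode so a password prompt fails instead of hanging.
check_orted() {
    host=$1
    if ${REMOTE_SH:-ssh -o BatchMode=yes} "$host" 'which orted' >/dev/null 2>&1; then
        echo "$host: orted found"
    else
        echo "$host: orted not found or ssh failed"
    fi
}

# Example (hosts taken from the mpirun line above):
#   for h in wolf45 wolf46; do check_orted "$h"; done
```

Running this several times may help show whether the failure is tied to
particular nodes or comes and goes.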
> One workaround so far is using the option --mca
> pls_rsh_debug. What does this switch do, other than producing more
> output, such that it resolves my problem?
It also slows the code down a bit such that the timing is different.
> Two other questions: what is the
> ras (Resource allocation subsystem), and how can I set it up / what
> options does it have?
I doubt that the ras is involved in the issue -- the ras is
used to read hostfiles, analyze lists of hosts from resource
managers, etc. It doesn't do anything in the actual launch.
> And the pls (Process launch subsystem): how can I set it up / what
> options does it have?
I assume you're using the RSH launcher; you can use the ompi_info
command to see what parameters are available for that component:
ompi_info --param pls rsh
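
Parameters listed by ompi_info can be given on the mpirun command line
with --mca <name> <value>, or through an environment variable named
OMPI_MCA_<name>. A small sketch of the environment route for a
Bourne-style shell; set_mca_param is a hypothetical helper name, and the
value 1 is an assumption (the thread does not state a value):

```shell
# Hypothetical helper: set an MCA parameter through the environment.
# Open MPI reads variables named OMPI_MCA_<parameter name>.
set_mca_param() {
    export "OMPI_MCA_$1=$2"
}

# Example, using the parameter mentioned earlier in this thread:
#   set_mca_param pls_rsh_debug 1
#   mpirun -np 2 --host wolf45,wolf46 /tmp/test.x
```

In a csh-style shell the equivalent would be a setenv of the same
variable name.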
--
Jeff Squyres
Cisco Systems