Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Error using hostfile
From: Reuti (reuti_at_[hidden])
Date: 2011-07-07 12:33:50


Hi,

On 07.07.2011 at 01:09, Mohan, Ashwin wrote:

> I use the following command (mpirun --prefix /usr/local/openmpi1.4.3 -np 4 hello) to successfully run a simple hello world program on a single node. Each node has 4 slots. Following the successful run on one node, I wish to use 4 nodes, and for this purpose I wrote a hostfile. I submitted my job using the following command:

It looks like you will either have to set up a passphraseless SSH login for each user between the machines, or do it once for the whole cluster using host-based authentication:

http://arc.liv.ac.uk/SGE/howto/hostbased-ssh.html
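
For the per-user variant, a minimal sketch could look like the following. It assumes the hostfile has one node per line in the form "myocyte47 slots=4", that the key lives in ~/.ssh/id_rsa, and that ssh-copy-id is installed; the function names (list_hosts, setup_keys, check_hosts) are made up for this example and are not part of Open MPI or OpenSSH.

```shell
#!/bin/sh
# Sketch only: per-user passphraseless SSH to every node in a hostfile.
# Assumption: hostfile lines look like "myocyte47 slots=4"; '#' starts a comment.

# Print the host name (first field) of each non-empty, non-comment line.
list_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Create a key with an empty passphrase (if none exists yet) and install
# the public key on every host. Each host asks for the password one last time.
setup_keys() {
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    for host in $(list_hosts "$1"); do
        ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$host"
    done
}

# Report which hosts accept a login without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_hosts() {
    for host in $(list_hosts "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: login without password FAILED"
        fi
    done
}
```

After sourcing the file, "setup_keys hostfile" followed by "check_hosts hostfile" should show "ok" for every node before mpirun is tried again. If the home directories are shared between the nodes, appending the public key to ~/.ssh/authorized_keys once may already be enough.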

Do you have the same users on all machines, with the same UID and GID?
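
One way to check the UID and GID is sketched below. It assumes that passphraseless login to the nodes already works (otherwise the remote call fails instead of prompting); the function names ids_on and compare_ids are hypothetical helpers for this example.

```shell
#!/bin/sh
# Sketch only: compare the current user's UID and GID across machines.

# Print "uid:gid" for the current user on the given host,
# or on the local machine when the argument is empty.
ids_on() {
    if [ -n "$1" ]; then
        ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" 'echo "$(id -u):$(id -g)"'
    else
        echo "$(id -u):$(id -g)"
    fi
}

# Compare every host given as an argument against the local machine.
# Returns non-zero if any host differs or cannot be reached.
compare_ids() {
    local_ids=$(ids_on "")
    status=0
    for host in "$@"; do
        if ! remote_ids=$(ids_on "$host" 2>/dev/null); then
            echo "$host: could not log in"
            status=1
            continue
        fi
        if [ "$remote_ids" = "$local_ids" ]; then
            echo "$host: $remote_ids matches"
        else
            echo "$host: $remote_ids differs from local $local_ids"
            status=1
        fi
    done
    return $status
}
```

With the node names from your output this would be called as "compare_ids myocyte47 myocyte48".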

-- Reuti

> mpirun --prefix /usr/local/openmpi1.4.3 -np 4 --hostfile hostfile hello
>
> Copied below is the output. How do I go about fixing this issue?
>
> **********************************************************************
>
> amohan@myocyte48's password: amohan@myocyte47's password:
> Permission denied, please try again.
> amohan@myocyte48's password:
> Permission denied, please try again.
> amohan@myocyte47's password:
> Permission denied, please try again.
> amohan@myocyte47's password:
> Permission denied, please try again.
> amohan@myocyte48's password:
>
> Permission denied (publickey,gssapi-with-mic,password).
> --------------------------------------------------------------------------
> A daemon (pid 22085) died unexpectedly with status 255 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
> --------------------------------------------------------------------------
> myocyte47 - daemon did not report back when launched
> myocyte48 - daemon did not report back when launched
>
> **********************************************************************
>
> Thanks,
> Ashwin.