Open MPI User's Mailing List Archives

This web mail archive is frozen.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

The new web archives of this list include all the mails in this frozen archive plus all mails sent to the list since it was migrated.

Subject: Re: [OMPI users] How does authentication between nodes work without password? (Newbie alert on)
From: David Zhang (solarbikedz_at_[hidden])
Date: 2011-02-10 01:58:07


I don't really know what the problem is. It seems like you're doing things
correctly. I'm almost sure you've done all of the following, but just to be
sure:
- each machine's ssh public key is in the other machine's authorized_keys file
- the ssh keys were generated without passphrases
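
Both items on that checklist can be scripted. What follows is only a sketch, assuming OpenSSH on both machines; the function name install_pubkey and the paths in the usage comment are placeholders of mine, not something taken from the setup described below. It appends a public key to an authorized_keys file once and sets the directory and file permissions that sshd's default StrictModes check accepts.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE SSH_DIR
# Hypothetical helper: append the public key in PUBKEY_FILE to
# SSH_DIR/authorized_keys (only if it is not already there) and tighten
# permissions so that sshd's default StrictModes check accepts the files.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=$2

    mkdir -p "$ssh_dir" || return 1
    touch "$ssh_dir/authorized_keys" || return 1

    # -x: whole-line match, -F: fixed strings, -f: read patterns from file
    if ! grep -qxFf "$pubkey_file" "$ssh_dir/authorized_keys"; then
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    fi

    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Usage (run on each machine; the key name is an assumption):
#   ssh-keygen -t rsa -N '' -f ~/.ssh/tsakai    # -N '' means empty passphrase
#   install_pubkey ~/.ssh/tsakai.pub ~/.ssh
```

Running the helper twice with the same key leaves a single entry, so it is safe to repeat on every node.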

On Wed, Feb 9, 2011 at 10:08 PM, Tena Sakai <tsakai_at_[hidden]> wrote:

> Hi,
>
> I have made a bit of progress(?)...
> I made a config file in my .ssh directory on the cloud. It looks like:
> # machine A
> Host domU-12-31-39-07-35-21.compute-1.internal
> HostName domU-12-31-39-07-35-21
> BatchMode yes
> IdentityFile /home/tsakai/.ssh/tsakai
> ChallengeResponseAuthentication no
> IdentitiesOnly yes
>
> # machine B
> Host domU-12-31-39-06-74-E2.compute-1.internal
> HostName domU-12-31-39-06-74-E2
> BatchMode yes
> IdentityFile /home/tsakai/.ssh/tsakai
> ChallengeResponseAuthentication no
> IdentitiesOnly yes
>
> This file exists on both machine A and machine B.
>
> Now when I issue the mpirun command below:
> [tsakai@domU-12-31-39-06-74-E2 ~]$ mpirun -app app.ac2
>
> It hangs. I control-C out of it and I get:
>
> mpirun: killing job...
>
>
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
>
> --------------------------------------------------------------------------
>
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
>
> --------------------------------------------------------------------------
> domU-12-31-39-07-35-21.compute-1.internal - daemon did not report back when launched
>
> Am I making progress?
>
> Does this mean I am past authentication and something else is the problem?
> Does someone have an example .ssh/config file I can look at? There are so
> many keyword-argument pairs for this config file and I would like to look
> at a very basic one that works.
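
A very basic config for this kind of setup needs only a few keywords. The sketch below prints one stanza for a node; print_ssh_stanza is a made-up helper name, while the key path and host names in the usage comment are the ones from the quoted config. One thing to watch: ssh matches the Host pattern against the name as it was given on the command line, before any HostName substitution, and as far as I know mpirun hands ssh the node name as written after -H. So the stanza lists both the short and the fully qualified form.

```shell
#!/bin/sh
# print_ssh_stanza KEYFILE NAME...
# Hypothetical helper: emit one ~/.ssh/config stanza that applies to every
# NAME given, so short and fully qualified host names can share settings.
print_ssh_stanza() {
    keyfile=$1
    shift
    # "$*" joins the remaining arguments with spaces; the Host keyword
    # accepts several whitespace-separated patterns on one line.
    cat <<EOF
Host $*
    IdentityFile $keyfile
    IdentitiesOnly yes
    BatchMode yes
EOF
}

# Usage (appends to the config; names and key path as in the quoted message):
#   print_ssh_stanza /home/tsakai/.ssh/tsakai \
#       domU-12-31-39-07-35-21 domU-12-31-39-07-35-21.compute-1.internal \
#       >> ~/.ssh/config
```

BatchMode yes makes ssh fail right away instead of prompting for a password, which turns a silent wait into a visible error.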
>
>
> Thank you.
>
> Tena Sakai
> tsakai_at_[hidden]
>
> On 2/9/11 7:52 PM, "Tena Sakai" <tsakai_at_[hidden]> wrote:
>
> Hi
>
> I have an app.ac1 file like below:
> [tsakai@vixen local]$ cat app.ac1
> -H vixen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 5
> -H vixen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 6
> -H blitzen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 7
> -H blitzen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 8
>
> The program I run is
> Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R x
> where x is in [5..8]. The machines vixen and blitzen each do 2 runs.
>
> Here’s the program fib.R:
> [tsakai@vixen local]$ cat fib.R
> # fib() computes, given index n, fibonacci number iteratively
> # here's the first dozen sequence (indexed from 0..11)
> # 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89
>
> fib <- function( n ) {
>   a <- 0
>   b <- 1
>   for ( i in 1:n ) {
>     t <- b
>     b <- a
>     a <- a + t
>   }
>   a
> }
>
> arg <- commandArgs( TRUE )
> myHost <- system( 'hostname', intern=TRUE )
> cat( fib(arg), myHost, '\n' )
>
> It reads an argument from command line and produces a fibonacci number that
> corresponds to that index, followed by the machine name. Pretty simple
> stuff.
>
> Here’s the run output:
> [tsakai@vixen local]$ mpirun -app app.ac1
> 5 vixen.egcrc.org
> 8 vixen.egcrc.org
> 13 blitzen.egcrc.org
> 21 blitzen.egcrc.org
>
> Which is exactly what I expect. So far so good.
>
> Now I want to run the same thing on the cloud. I launch 2 instances of the
> same virtual machine, which I get to by:
> [tsakai@vixen local]$ ssh -A -i ~/.ssh/tsakai machine-instance-A-public-dns
>
> Now I am on machine A:
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
>
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ # and I can go to machine B without password authentication,
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ # i.e., use public/private key
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
> domU-12-31-39-00-D1-F2
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ ssh -i .ssh/tsakai domU-12-31-39-0C-C8-01
> Last login: Wed Feb 9 20:51:48 2011 from 10.254.214.4
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ # I am now on machine B
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ hostname
> domU-12-31-39-0C-C8-01
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ # now show I can get to machine A without using password
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ ssh -i .ssh/tsakai domU-12-31-39-00-D1-F2
> The authenticity of host 'domu-12-31-39-00-d1-f2 (10.254.214.4)' can't be established.
> RSA key fingerprint is e3:ad:75:b1:a4:63:7f:0f:c4:0b:10:71:f3:2f:21:81.
> Are you sure you want to continue connecting (yes/no)? yes
> Warning: Permanently added 'domu-12-31-39-00-d1-f2' (RSA) to the list of known hosts.
> Last login: Wed Feb 9 20:49:34 2011 from 10.215.203.239
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
> domU-12-31-39-00-D1-F2
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ exit
> logout
> Connection to domU-12-31-39-00-D1-F2 closed.
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ exit
> logout
> Connection to domU-12-31-39-0C-C8-01 closed.
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ # back at machine A
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
> domU-12-31-39-00-D1-F2
>
> As you can see, neither machine uses a password for authentication; each uses
> public/private key pairs. There is no problem (that I can see) with ssh
> invocation from one machine to the other. This is so because I have a copy
> of the public key and a copy of the private key on each instance.
>
> The app.ac1 file is identical, except for the node names:
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ cat app.ac1
> -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 5
> -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 6
> -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 7
> -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 8
>
> Here’s what happens with mpirun:
>
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ mpirun -app app.ac1
> tsakai@domu-12-31-39-0c-c8-01's password:
> Permission denied, please try again.
> tsakai@domu-12-31-39-0c-c8-01's password: mpirun: killing job...
>
>
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
>
> --------------------------------------------------------------------------
>
> mpirun: clean termination accomplished
>
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
>
> Mpirun (or somebody else?) asks me for a password, which I don’t have.
> I end up typing control-C.
>
> Here’s my question:
> How can I get past authentication by mpirun where there is no password?
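
One likely difference between the manual logins above and the mpirun case: the manual ssh commands pass -i .ssh/tsakai, whereas mpirun's ssh launcher, as far as I know, runs a plain "ssh <node> ..." with no -i. In that case only ssh's default identity files, or whatever ~/.ssh/config names, get tried. The sketch below checks each node the way such a non-interactive launcher would reach it. The name check_nodes is one I made up, and the 5-second timeout is arbitrary.

```shell
#!/bin/sh
# check_nodes NODE...
# Hypothetical helper: try a non-interactive login to each NODE without -i,
# which is how a launcher that simply calls "ssh NODE command" would connect.
# Returns non-zero if any node cannot be reached without a prompt.
check_nodes() {
    status=0
    for node in "$@"; do
        # BatchMode=yes: fail instead of asking for a password or passphrase.
        # -n: do not read stdin, so the loop behaves inside scripts.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$node" true; then
            echo "ok:   $node"
        else
            echo "FAIL: $node"
            status=1
        fi
    done
    return "$status"
}

# Usage, with the node names from app.ac1:
#   check_nodes domU-12-31-39-00-D1-F2 domU-12-31-39-0C-C8-01
```

A FAIL here does not always mean the key was rejected. In batch mode ssh also refuses to ask about an unknown host key, which is the question that came up on the first login from machine B to machine A.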
>
> I would appreciate your help/insight greatly.
>
> Thank you.
>
> Tena Sakai
> tsakai_at_[hidden]
>
>

-- 
David Zhang
University of California, San Diego