Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] can't preload binary to remote machine
From: Ralph Castain (rhc_at_[hidden])
Date: 2014-06-03 23:46:30


Sorry for the delayed response - it's been a little hectic here.

I suspect the problem is that 1.6.5 really needs a passwordless ssh connection in order to preload the file. This isn't required in the 1.8 series, so you might want to try it with 1.8.1. Otherwise, resolve the password issue and it should work.
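
For anyone reading this in the archive, here is a minimal sketch of how the password issue might be checked, assuming OpenSSH on both machines. The function name `check_passwordless_ssh` is hypothetical, and `user@remotehost` is a placeholder for the remote account named in the appfile below.

```shell
#!/bin/sh
# Sketch only: confirm that key-based (passwordless) ssh works before mpirun.
# "user@remotehost" is a placeholder, not a real account.

# Prints a status line and returns 0 if ssh can log in to $1 without
# prompting. BatchMode=yes makes ssh fail rather than ask for a password.
check_passwordless_ssh() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null; then
        echo "passwordless ssh to $1: ok"
    else
        echo "passwordless ssh to $1: FAILED"
        return 1
    fi
}

# One-time setup, run by hand on the machine where mpirun is started:
#   ssh-keygen -t rsa             # creates a key pair under ~/.ssh
#   ssh-copy-id user@remotehost   # installs the public key on the remote side
#
# Then verify:
#   check_passwordless_ssh user@remotehost
```

The scp command in the warning below copies the file from the launching host, so it may be worth running the same check from the remote machine back to the launching one as well.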

On May 20, 2014, at 6:28 AM, Cordone, Guthrie <gcordone_at_[hidden]> wrote:

> Hello,
>
> I have two Linux machines, each running Open MPI 1.6.5. I want to use the --preload-binary option in an appfile so that a binary stored on the host is copied to the remote node and executed on both the node and the host during mpirun. Right now I am using an appfile with the contents:
>
> #appfile.test
> -host user_at_remotehost --preload-binary -np 1 run_date
> -host localhost -np 1 run_date
>
> where 'run_date' is an executable that creates a text file with the current date. I run the appfile using the command:
>
> mpirun -app appfile.test
>
> I enter user_at_remotehost's password when prompted and then immediately receive an error:
>
> --------------------------------------------------------------------------
> WARNING: Remote peer ([[53924,0],1]) failed to preload a file.
>
> Exit Status: 256
> Local File: /tmp/openmpi-sessions-user_at_remotehost_0/53924/0/run_date
> Remote File: run_date
> Command:
> scp localhost:/home/user/appfileTest/run_date /tmp/openmpi-sessions-user_at_remotehost_0/53924/0/run_date
>
> Will continue attempting to launch the process(es).
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun was unable to launch the specified application as it could not access
> or execute an executable:
>
> Executable: run_date
> Node: user_at_remotehost
>
> while attempting to start process rank 0.
> --------------------------------------------------------------------------
>
> After this error, I am returned to the command line and find that the 'run_date' binary has been executed on the localhost but not on the remotehost.
>
> I have been able to run on both machines by manually placing the binary on the remotehost and removing the '--preload-binary' option from the appfile; however, I need the appfile to place the binary for me. I have also tried setting the remote machine's working directory using '-wdir' but receive the same error.
>
> Do you guys know what the issue is?
>
>
> Guthrie Cordone
> Systems Engineering Intern
> Phone: 315-883-4484
> gcordone_at_[hidden]