Open MPI Development Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-07-28 08:03:06

Thanks for reporting this. I just committed code to the rsh pls to
specifically check $bindir if the orted is not found in your path (on
the local node). If orted is still not found, it'll now issue a
friendly error message:

[7:58] vogon:~/mpi % mpirun -np 1 hello

The rsh PLS component was not able to find the executable "orted" in
your PATH or in the directory where Open MPI was initially installed,
and therefore cannot continue.
For reference, your current PATH is:
We also looked for orte in the following directory:
[0,0,0] ORTE_ERROR_LOG: ORTE_ERR_NOT_FOUND in file rmgr_urm.c at line  
mpirun: spawn failed with errno=-16
ERROR: A daemon on node vogon failed to start as expected.
ERROR: There may be more information available from
ERROR: the remote shell (see above).
ERROR: The daemon exited unexpectedly with status 240.
[7:59] vogon:~/mpi %

I also included your current $PATH in the message, so that problems
like the one you ran into are more obvious (some other agent changing
your PATH to something that you didn't expect).
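
For illustration only, here is a minimal sketch of the lookup order described above: search PATH first, fall back to the install bindir, and report both if the daemon still isn't found. This is not the code that was committed. find_in_path() and locate_daemon() are hypothetical names, the messages are made up, and the real component uses opal_path_findv(), whose signature is not reproduced here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not an Open MPI function): search a colon-separated,
 * PATH-style list for a regular, executable file called name.
 * Returns a malloc'ed full path that the caller must free, or NULL. */
char *find_in_path(const char *name, const char *path_list)
{
    if (name == NULL || path_list == NULL) {
        return NULL;
    }
    const char *p = path_list;
    for (;;) {
        const char *end = strchr(p, ':');
        size_t dlen = end ? (size_t)(end - p) : strlen(p);
        if (dlen > 0) {
            size_t len = dlen + 1 + strlen(name) + 1;
            char *candidate = malloc(len);
            if (candidate == NULL) {
                return NULL;
            }
            snprintf(candidate, len, "%.*s/%s", (int)dlen, p, name);
            struct stat st;
            if (stat(candidate, &st) == 0 && S_ISREG(st.st_mode) &&
                access(candidate, X_OK) == 0) {
                return candidate;
            }
            free(candidate);
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return NULL;
}

/* Hypothetical helper: try PATH first, then the install bindir.
 * On failure, print both locations and return NULL, so the caller never
 * hands a NULL pointer to execv(). */
char *locate_daemon(const char *name, const char *path_list,
                    const char *bindir)
{
    char *found = find_in_path(name, path_list);
    if (found == NULL) {
        found = find_in_path(name, bindir);
    }
    if (found == NULL) {
        fprintf(stderr,
                "Could not find \"%s\".\nPATH searched: %s\n"
                "Install directory searched: %s\n",
                name ? name : "(null)",
                path_list ? path_list : "(unset)",
                bindir ? bindir : "(unset)");
    }
    return found;
}
```

A caller would pass something like locate_daemon("orted", getenv("PATH"), bindir) and only call execv() when the result is non-NULL. The point of the design is that the "not found" case is handled where the lookup happens, rather than surfacing later as an EFAULT from execv().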

On Jul 27, 2005, at 12:50 PM, Greg Watson wrote:
> Hi all,
>
> To recap: the problem was that if orted was launched from Eclipse (on
> OS X) then subsequent attempts to run a program (using mpirun or
> whatever) returned immediately. If orted was launched from anywhere
> else (java, command line, etc.) it worked fine.
>
> Turning on daemon logging showed that the program was aborting
> immediately because the execv() of the ssh command to the remote
> machine was failing with errno=14 (EFAULT). Clearly there was some
> environment difference, and after much checking it became apparent
> that the Eclipse-launched orted did not have $(OMPI_INSTALL) in its
> path. The orte_pls_rsh_launch() function checks whether you're
> launching onto the local or a remote machine. For local machines (as
> it was in this case), it calls opal_path_findv() to find the local
> path of orted. Unfortunately, because $(OMPI_INSTALL) is not included
> in the local path, this fails by returning NULL. The NULL is then
> passed as the first argument of execv(), which fails with EFAULT.
>
> The problem is easily reproducible by taking $(OMPI_INSTALL) out of
> your path, running $(OMPI_INSTALL)/orted, then trying to run
> something with mpirun.
>
> Why did it work from the command line? On OS X, the shell gets the
> PATH set in ~/.bash_profile, etc. (which in this case contained
> OMPI_INSTALL), but applications launched from the window system get
> their path from the loginwindow app, which looks in
> ~/.MacOSX/environment.plist for environment variables (which didn't
> contain OMPI_INSTALL). I suspect, but haven't tried, that launching
> Eclipse from the command line would have worked.
>
> I'm not sure why the logic is there to look up the path again for
> local launches, since it should be the same as the path in the
> component. It should certainly check for a NULL return though.
>
> Greg

{+} Jeff Squyres
{+} The Open MPI Project