Thanks for tracking this down!
Greg Watson wrote:
> Hi all,
> To recap: the problem was that if orted was launched from Eclipse (on
> OS X) then subsequent attempts to run a program (using mpirun or
> whatever) returned immediately. If orted was launched from anywhere
> else (java, command line, etc.) it worked fine.
> Turning on daemon logging showed that the reason that the program was
> aborting immediately was that the execv() of the ssh command to the
> remote machine was exiting with errno=14 (EFAULT). Clearly there was
> some environment difference, and after much checking it became
> apparent that the difference was that the Eclipse-launched orted did
> not have $(OMPI_INSTALL) in its path. The orte_pls_rsh_launch()
> function checks if you're launching onto the local or a remote
> machine. For local machines (as it was in this case), it calls
> opal_path_findv() to find the local path of orted. Unfortunately
> because $(OMPI_INSTALL) is not included in the local path, this fails
> by returning NULL. The NULL is then passed as the first argument to
> execv(), which fails with EFAULT.
> The problem is easily reproducible by taking $(OMPI_INSTALL) out of
> your path, running $(OMPI_INSTALL)/orted, then trying to run
> something with mpirun.
> Why did it work from the command line? On OS X, the shell gets the
> PATH set in ~/.bash_profile, etc., (which in this case contained
> OMPI_INSTALL) but applications launched from the window system get their
> path from the loginwindow app, which looks in ~/.MacOSX/
> environment.plist for environment variables (which didn't contain
> OMPI_INSTALL). I suspect, but haven't tried, that launching Eclipse from
> the command line would have worked.
> I'm not sure why the logic is there to look up the path again for
> local launches, since it should be the same as the path in the
> component. It should certainly check for a NULL return though.
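For what it's worth, here is a rough sketch of the kind of NULL check Greg is suggesting. It is not the actual Open MPI code: find_in_path() is a hypothetical stand-in for a PATH lookup like opal_path_findv(), and exec_checked() is an illustrative wrapper name I made up. The only point is that a failed lookup is reported and never reaches execv().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a PATH lookup such as opal_path_findv().
 * Returns a malloc'd "dir/name" for the first directory in the
 * colon-separated `path` where access(X_OK) succeeds, or NULL if none
 * does. Simplification: it does not rule out directories. */
static char *find_in_path(const char *name, const char *path)
{
    assert(name != NULL);
    if (path == NULL) {
        return NULL;
    }

    char *dirs = strdup(path);          /* strtok_r modifies its input */
    if (dirs == NULL) {
        return NULL;
    }

    char *result = NULL;
    char *save = NULL;
    for (char *dir = strtok_r(dirs, ":", &save); dir != NULL;
         dir = strtok_r(NULL, ":", &save)) {
        size_t len = strlen(dir) + strlen(name) + 2;   /* '/' and '\0' */
        char *candidate = malloc(len);
        if (candidate == NULL) {
            break;
        }
        snprintf(candidate, len, "%s/%s", dir, name);
        if (access(candidate, X_OK) == 0) {
            result = candidate;         /* caller frees */
            break;
        }
        free(candidate);
    }
    free(dirs);
    return result;
}

/* Illustrative wrapper: refuse to call execv() when the lookup failed,
 * instead of handing it a NULL path and getting EFAULT back. */
static int exec_checked(const char *exec_path, char *const argv[])
{
    if (exec_path == NULL) {
        fprintf(stderr, "executable not found in PATH; not calling execv()\n");
        return -1;
    }
    execv(exec_path, argv);
    perror("execv");                    /* only reached if execv() failed */
    return -1;
}
```

A caller would do something like exec_checked(find_in_path("orted", getenv("PATH")), argv) and treat -1 as a launch failure with a usable error message, rather than a bare errno=14.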