Open MPI Development Mailing List Archives

From: Tim S. Woodall (twoodall_at_[hidden])
Date: 2005-07-28 09:24:50


Greg,

Thanks for tracking this down!
Tim

Greg Watson wrote:
> Hi all,
>
> To recap: the problem was that if orted was launched from Eclipse (on
> OS X) then subsequent attempts to run a program (using mpirun or
> whatever) returned immediately. If orted was launched from anywhere
> else (Java, command line, etc.) it worked fine.
>
> Turning on daemon logging showed that the reason that the program was
> aborting immediately was that the execv() of the ssh command to the
> remote machine was exiting with errno=14 (EFAULT). Clearly there was
> some environment difference, and after much checking it became
> apparent that the difference was that the Eclipse-launched orted did
> not have $(OMPI_INSTALL) in its path. The orte_pls_rsh_launch()
> function checks if you're launching onto the local or a remote
> machine. For local machines (as it was in this case), it calls
> opal_path_findv() to find the local path of orted. Unfortunately
> because $(OMPI_INSTALL) is not included in the local path, this fails
> by returning NULL. The NULL is then passed as the first argument to
> execv(), which fails with EFAULT.
>
> The problem is easily reproducible by taking $(OMPI_INSTALL) out of
> your path, running $(OMPI_INSTALL)/orted, then trying to run
> something with mpirun.
>
> Why did it work from the command line? On OS X, the shell gets the
> PATH set in ~/.bash_profile, etc., (which in this case contained
> OMPI_INSTALL) but applications launched from the window system get their
> path from the loginwindow app, which looks in
> ~/.MacOSX/environment.plist for environment variables (which didn't
> contain OMPI_INSTALL). I suspect, but haven't tried, that launching
> Eclipse from the command line would have worked.
>
> I'm not sure why the logic is there to look up the path again for
> local launches, since it should be the same as the path in the
> component. It should certainly check for a NULL return though.
>
> Greg
>
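
For illustration, the NULL check Greg suggests might look roughly like the sketch below. This is not the actual orte_pls_rsh_launch() code, and it does not use opal_path_findv(): find_in_path() and launch_local() are hypothetical stand-ins written only for this example, assuming a POSIX system. The point is simply that a failed path lookup is reported and handled before execv() is ever called.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (NOT the Open MPI API): search the colon-separated
 * directory list `path` for an executable regular file called `name`.
 * Returns a malloc'd full path on success, or NULL if nothing is found. */
static char *find_in_path(const char *name, const char *path)
{
    if (name == NULL || path == NULL) {
        return NULL;
    }

    const char *start = path;
    for (;;) {
        const char *end = strchr(start, ':');
        size_t dir_len = end ? (size_t)(end - start) : strlen(start);

        if (dir_len > 0) {
            size_t len = dir_len + 1 + strlen(name) + 1;
            char *candidate = malloc(len);
            if (candidate == NULL) {
                return NULL;
            }
            snprintf(candidate, len, "%.*s/%s", (int)dir_len, start, name);

            struct stat st;
            if (stat(candidate, &st) == 0 && S_ISREG(st.st_mode) &&
                access(candidate, X_OK) == 0) {
                return candidate;   /* caller frees */
            }
            free(candidate);
        }

        if (end == NULL) {
            break;                  /* last directory examined */
        }
        start = end + 1;
    }
    return NULL;
}

/* Hypothetical launcher: looks `name` up in $PATH and refuses to call
 * execv() when the lookup fails, instead of passing NULL along. */
static int launch_local(const char *name, char *const argv[])
{
    char *exec_path = find_in_path(name, getenv("PATH"));
    if (exec_path == NULL) {
        fprintf(stderr, "launch: could not find '%s' in PATH\n", name);
        return -1;
    }

    execv(exec_path, argv);

    /* execv() only returns on failure; errno says why. */
    perror("execv");
    free(exec_path);
    return -1;
}
```

With a check like this, a daemon started without $(OMPI_INSTALL) in its PATH would log a "could not find" message rather than failing later with an unexplained EFAULT.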