Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] getting opal_init:startup:internal-failure
From: E.O. (ooyama.eiichi_at_[hidden])
Date: 2013-04-28 11:58:23


Thank you, Ralph!
I ran it with the "-prefix" option, but I got this...

[root_at_host1 tmp]# mpirun -prefix /myname -np 4 -host host2 ./hello.out
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: -prefix=/myname
Node: host1

while attempting to start process rank 0.
--------------------------------------------------------------------------
[root_at_host1 tmp]#
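
(For reference, a minimal sketch of the double-dash spelling that the mpirun man page documents for this option, reusing the same hosts and the /myname prefix from above; whether it behaves any differently from the single-dash form here is not verified:)

mpirun --prefix /myname -np 4 -host host2 ./hello.out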

I also updated PATH on the remote host (host2) to include /myname,
but it didn't seem to change anything...
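
(For reference, a minimal sketch of what the shell on host2 would need to export for commands launched non-interactively over ssh, assuming the /myname prefix from this thread and the standard bin/ and lib/ layout of a --prefix install; on a busybox system the file that is actually sourced for non-interactive shells may not be ~/.bashrc:)

export PATH=/myname/bin:$PATH
export LD_LIBRARY_PATH=/myname/lib:$LD_LIBRARY_PATH
export OPAL_PREFIX=/myname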

eiichi

On Sun, Apr 28, 2013 at 11:48 AM, Ralph Castain <rhc_at_[hidden]> wrote:

> The problem is likely that your path variables aren't being set properly
> on the remote machine when mpirun launches the remote daemon. You might
> check to see that your default shell rc file is also setting those values
> correctly. Alternatively, modify your mpirun cmd line a bit by adding
>
> mpirun -prefix /myname ...
>
> so it will set the remote prefix and see if that helps. If it does, you
> can add --enable-orterun-prefix-by-default to your configure line so mpirun
> always adds it.
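> 
> (As a sketch, assuming the same /myname prefix used elsewhere in this thread, that configure line would look something like
> 
> ./configure --prefix=/myname --enable-orterun-prefix-by-default
> 
> followed by the usual make and make install, and then re-copying /myname to the other hosts.)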
>
>
> On Apr 28, 2013, at 7:56 AM, "E.O." <ooyama.eiichi_at_[hidden]> wrote:
>
> > Hello
> >
> > I have five linux machines (one is redhat and the others are busybox).
> > I downloaded openmpi-1.6.4.tar.gz onto my main redhat machine and
> > configured/compiled it successfully:
> > ./configure --prefix=/myname
> > I installed it to the /myname directory successfully. I am able to run a
> > simple hello.c on my redhat machine.
> >
> > [root_at_host1 /tmp] # mpirun -np 4 ./hello.out
> > I am parent
> > I am a child
> > I am a child
> > I am a child
> > [root_at_host1 /tmp] #
> >
> > Then, I sent the entire /myname directory to another machine (host2).
> > [root_at_host1 /] # tar zcf - myname | ssh host2 "(cd /; tar zxf -)"
> >
> > and ran mpirun targeting that host (host2).
> >
> > [root_at_host1 tmp]# mpirun -np 4 -host host2 ./hello.out
> > --------------------------------------------------------------------------
> > Sorry! You were supposed to get help about:
> > opal_init:startup:internal-failure
> > But I couldn't open the help file:
> > //share/openmpi/help-opal-runtime.txt: No such file or directory. Sorry!
> > --------------------------------------------------------------------------
> > [host2:26294] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
> > [host2:26294] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orted/orted_main.c at line 358
> > --------------------------------------------------------------------------
> > A daemon (pid 23691) died unexpectedly with status 255 while attempting
> > to launch so we are aborting.
> >
> > There may be more information reported by the environment (see above).
> >
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> > location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> > --------------------------------------------------------------------------
> > --------------------------------------------------------------------------
> > mpirun noticed that the job aborted, but has no info as to the process
> > that caused that situation.
> > --------------------------------------------------------------------------
> > [root_at_host1 tmp]#
> >
> > I set those environment variables
> >
> > [root_at_host1 tmp]# echo $LD_LIBRARY_PATH
> > /myname/lib/
> > [root_at_host1 tmp]# echo $OPAL_PREFIX
> > /myname/
> > [root_at_host1 tmp]#
> >
> > [root_at_host2 /] # ls -la /myname/lib/libmpi.so.1
> > lrwxrwxrwx 1 root root 15 Apr 28 10:21 /myname/lib/libmpi.so.1 -> libmpi.so.1.0.7
> > [root_at_host2 /] #
> >
> > If I run the ./hello.out binary directly on host2, it works fine:
> >
> > [root_at_host1 tmp]# ssh host2
> > [root_at_host2 /] # /tmp/hello.out
> > I am parent
> > [root_at_host2 /] #
> >
> > Can someone help me figure out why I cannot run hello.out on host2 from
> > host1?
> > Am I missing any env variables?
> >
> > Thank you,
> >
> > Eiichi
> >
> >
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>