
Open MPI Development Mailing List Archives


From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2005-09-27 15:53:35


This exact problem came up in a different context today.

This is only a side-effect of us having crummy error messages. :-(

What is happening is that OMPI is not finding its components.
Specifically, it's looking for the SDS components in this case, not
finding them, and then barfing.

Open MPI, by default, looks in $prefix/lib/openmpi and
$HOME/.openmpi/components for its components. This is set with the
"mca_component_path" MCA parameter -- you can certainly change it to be
whatever you need. For example:

-----
[15:26] odin:~/svn/ompi/ompi/runtime % ompi_info --param mca all
[snipped]
                  MCA mca: parameter "mca_component_path" (current value:
                           "/u/jsquyres/bogus/lib/openmpi:/u/jsquyres/.openmpi/components")
                           Path where to look for Open MPI and ORTE components
[snipped]
-----
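Since the parameter is an ordinary colon-separated search list (like PATH), prepending a local staging area is just string assembly. A minimal sketch, with assumed directory names (/opt/openmpi as the install prefix and /tmp/ompi/openmpi as the staging area):

```shell
prefix=/opt/openmpi                      # assumed install prefix
# The default locations Open MPI searches:
default="$prefix/lib/openmpi:$HOME/.openmpi/components"
# Prepend the staged copy so it is found first:
path="/tmp/ompi/openmpi:$default"
echo "$path"
```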

So you should be able to:

        orterun --mca mca_component_path /path/where/you/have/them ...

Disclaimer: this *used* to work, but I haven't tried it in a long time.
There's no reason that it shouldn't work, but we all know how bit rot
happens...
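You can also set the parameter through the environment instead of on the command line: Open MPI picks up any OMPI_MCA_<param_name> environment variable as an MCA parameter. A sketch, assuming a hypothetical /tmp/ompi/openmpi staging directory:

```shell
# Hypothetical staging directory where the components were copied
staged=/tmp/ompi/openmpi

# Any MCA parameter can be set via an OMPI_MCA_<param_name> variable:
export OMPI_MCA_mca_component_path="$staged"

# With Open MPI installed you would now just run, e.g.:
#   orterun -np 2 ./my_mpi_program
echo "$OMPI_MCA_mca_component_path"
```

The environment variable form has the advantage of being forwarded to remote daemons along with the rest of the environment, so you don't have to repeat --mca on every invocation.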

However, be aware that the wrapper compilers are still hard-coded to
look in $prefix/lib when linking in the OMPI/ORTE/OPAL libraries. You
can override that stuff with environment variables if you need to, but
it's not desirable.
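For reference, a sketch of what that override could look like. I'm assuming the usual wrapper hooks here (--showme plus the OMPI_CFLAGS / OMPI_LDFLAGS / OMPI_LIBS environment variables); check the mpicc man page on your installation before relying on them:

```shell
# Inspect what the wrapper would run, without compiling anything:
#   mpicc --showme           # full underlying command line
#   mpicc --showme:link      # just the link arguments

# Point the link step at a relocated library directory
# (/tmp/ompi is the hypothetical staging area from this thread):
export OMPI_LDFLAGS="-L/tmp/ompi -Wl,-rpath,/tmp/ompi"
echo "$OMPI_LDFLAGS"
```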

Sidenote: in LAM, we had a single, top-level environment variable named
LAMHOME that would override all this stuff. However, we found that it
*really* confused most users -- there were very, very few instances
where there was a genuine need for it. So we didn't add a single,
top-level control like that in OMPI.

On Sep 27, 2005, at 4:27 PM, Greg Watson wrote:

> Hi,
>
> Trying to install ompi on a bproc machine with no network filesystem.
> I've copied the contents of the ompi lib directory into /tmp/ompi on
> each node and set my LD_LIBRARY_PATH to /tmp/ompi. However when I run
> the program, I get the following error. Any suggestions on what else
> I need to do?
>
> Thanks,
>
> Greg
>
> [n0:31161] [NO-NAME] ORTE_ERROR_LOG: Not found in file
> orte_init_stage1.c at line 191
> [n0:31161] [NO-NAME] ORTE_ERROR_LOG: Not found in file
> orte_system_init.c at line 39
> [n0:31161] [NO-NAME] ORTE_ERROR_LOG: Not found in file orte_init.c at
> line 47
> --------------------------------------------------------------------------
> Sorry! You were supposed to get help about:
> orted:init-failure
> from the file:
> help-orted.txt
> But I couldn't find any file matching that name. Sorry!
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> A daemon (pid 31161) launched by the bproc PLS component on node 0 died
> unexpectedly so we are aborting.
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to
> have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --------------------------------------------------------------------------
> [bluesteel.lanl.gov:31160] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 870
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/