Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Init failing in singleton
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-07-07 09:53:20


Check your PATH and LD_LIBRARY_PATH. It looks like you are picking up a stale orted binary and/or stale libraries (perhaps the default OMPI installation instead of 1.4.2) on the machine where it fails.
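
One way to check this on the failing machine is sketched below. It is only an illustration, not an official Open MPI diagnostic: find_in_path is a hypothetical helper written for this example, and ./test is the test binary from the quoted command. Note that the quoted command sets LD_LIBRARY_PATH but not PATH, so an orted from another installation may be found first. The sketch lists every orted on PATH in search order (the first one listed is the one that would be launched), then shows which MPI libraries the binary resolves.

```shell
# find_in_path NAME: print every executable called NAME found on PATH,
# in search order. (Hypothetical helper for this example.)
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

# List all orted candidates; more than one line suggests mixed installations.
find_in_path orted

# If the test binary is present, show which Open MPI libraries it resolves
# under the current LD_LIBRARY_PATH.
if [ -x ./test ] && command -v ldd >/dev/null 2>&1; then
    ldd ./test | grep -E 'libmpi|libopen-rte|libopen-pal' || true
fi
```

If the first orted listed is not under /home/gmaj/openmpi/bin (assuming the usual bin/ and lib/ layout under the install prefix), prepending that directory to PATH for the run would be the natural thing to try.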

On Jul 7, 2010, at 7:44 AM, Grzegorz Maj wrote:

> Hi,
> I was trying to run some MPI processes as singletons. On some of the
> machines they crash in MPI_Init. I use exactly the same binaries of my
> application and the same installation of Open MPI 1.4.2 on two machines;
> it works on one of them and fails on the other. This is the
> command and its output (test is a simple application that calls only
> MPI_Init and MPI_Finalize):
>
> LD_LIBRARY_PATH=/home/gmaj/openmpi/lib ./test
> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> ../../../../../orte/mca/ess/hnp/ess_hnp_module.c at line 161
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_plm_base_select failed
> --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> ../../orte/runtime/orte_init.c at line 132
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_ess_set_name failed
> --> Returned value Not found (-13) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> [host01:21866] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file
> ../../orte/orted/orted_main.c at line 323
> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
> daemon on the local node in file
> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
> 381
> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
> daemon on the local node in file
> ../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line
> 143
> [host01:21865] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a
> daemon on the local node in file ../../orte/runtime/orte_init.c at
> line 132
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_ess_set_name failed
> --> Returned value Unable to start a daemon on the local node (-128)
> instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: orte_init failed
> --> Returned "Unable to start a daemon on the local node" (-128)
> instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** before MPI was initialized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [host01:21865] Abort before MPI_INIT completed successfully; not able
> to guarantee that all other processes were killed!
>
>
> Any ideas on this?
>
> Thanks,
> Grzegorz Maj