Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] orte_ns_base_select failed: returned value -1 instead of ORTE_SUCCESS
From: Aurélien Bouteiller (bouteill_at_[hidden])
Date: 2008-01-31 02:18:27


I tried using a fresh trunk; the same problem occurred. Here is the
complete configure line. I am using libtool 1.5.22 from fink.
Otherwise everything is standard Mac OS 10.5.

   $ ../trunk/configure --prefix=/Users/bouteill/ompi/build \
       --enable-mpirun-prefix-by-default --disable-io-romio --enable-debug \
       --enable-picky --enable-mem-debug --enable-mem-profile \
       --enable-visibility --disable-dlopen --disable-shared --enable-static

The error message generated by abort contains garbage: the line numbers
do not match anything in the .c files, and according to gdb the failure
does not occur during ns initialization. This looks like heap corruption
or something equally bad.

orterun (argc=4, argv=0xbffff81c) at ../../../../trunk/orte/tools/orterun/orterun.c:529
529     cb_states = ORTE_PROC_STATE_TERMINATED | ORTE_PROC_STATE_AT_STG1;
(gdb) n
530     rc = orte_rmgr.spawn_job(apps, num_apps, &jobid, 0, NULL, job_state_callback, cb_states, &attributes);
(gdb) n
531     while (NULL != (item = opal_list_remove_first(&attributes))) OBJ_RELEASE(item);
(gdb) n
** Stepping over inlined function code. **
532     OBJ_DESTRUCT(&attributes);
(gdb) n
534     if (orterun_globals.do_not_launch) {
(gdb) n
539     OPAL_THREAD_LOCK(&orterun_globals.lock);
(gdb) n
541     if (ORTE_SUCCESS == rc) {
(gdb) n
542     while (!orterun_globals.exit) {
(gdb) n
543     opal_condition_wait(&orterun_globals.cond,
(gdb) n
[grosse-pomme.local:77335] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/oob_base_init.c at line 74
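
A note on the paths above: the log names files under /SourceCache/openmpi/openmpi-5/, not under the trunk checkout, which would be consistent with a binary from another installation being picked up at run time. As a minimal sketch of how to check that, the script below reports whether mpirun, orterun and orted, as resolved through PATH, live under the --prefix given to configure. The PREFIX default is copied from the configure line; the helper name under_prefix is hypothetical, and the script only inspects PATH, so it cannot rule out other causes such as heap corruption.

```shell
#!/bin/sh
# Sketch: check whether the Open MPI tools found on PATH belong to the
# installation prefix used at configure time.
# Assumption: PREFIX defaults to the --prefix from the configure line.
PREFIX="${PREFIX:-/Users/bouteill/ompi/build}"

# under_prefix FILE DIR: succeeds if FILE is located somewhere below DIR.
# (Hypothetical helper, not part of Open MPI.)
under_prefix() {
    case "$1" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

for tool in mpirun orterun orted; do
    # command -v prints the path the shell would execute, if any.
    found=$(command -v "$tool" 2>/dev/null) || found=""
    if [ -z "$found" ]; then
        echo "$tool: not found on PATH"
    elif under_prefix "$found" "$PREFIX"; then
        echo "$tool: $found (under $PREFIX)"
    else
        echo "$tool: $found (NOT under $PREFIX, possibly another installation)"
    fi
done
```

Run it in the same shell that launches mpirun; setting PREFIX in the environment overrides the default.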

Aurelien

On Jan 30, 2008, at 17:18, Ralph Castain wrote:

> Are you running on the trunk, or an earlier release?
>
> If the trunk, then I suspect you have a stale library hanging around. I
> build and run statically on Leopard regularly.
>
>
> On 1/30/08 2:54 PM, "Aurélien Bouteiller" <bouteill_at_[hidden]> wrote:
>
>> I get a runtime error in static build on Mac OS 10.5 (automake 1.10,
>> autoconf 2.60, gcc-apple-darwin 4.01, libtool 1.5.22).
>>
>> The error does not occur in dso builds, and everything seems to work
>> fine on Linux.
>>
>> Here is the error log.
>>
>> ~/ompi$ mpirun -np 2 NetPIPE_3.6/NPmpi
>> [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/oob_base_init.c at line 74
>> [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/ns/proxy/ns_proxy_component.c at line 222
>> [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Error in file /SourceCache/openmpi/openmpi-5/openmpi/orte/runtime/orte_init_stage1.c at line 230
>> --------------------------------------------------------------------------
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> orte_ns_base_select failed
>> --> Returned value -1 instead of ORTE_SUCCESS
>>
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> It looks like MPI_INIT failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during MPI_INIT; some of which are due to configuration or environment
>> problems. This failure appears to be an internal failure; here's some
>> additional information (which may only be relevant to an Open MPI
>> developer):
>>
>> ompi_mpi_init: orte_init_stage1 failed
>> --> Returned "Error" (-1) instead of "Success" (0)
>> --------------------------------------------------------------------------
>> *** An error occurred in MPI_Init
>> *** before MPI was initialized
>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>
>>
>>
>> --
>> Dr. Aurélien Bouteiller
>> Sr. Research Associate - Innovative Computing Laboratory
>> Suite 350, 1122 Volunteer Boulevard
>> Knoxville, TN 37996
>> 865 974 6321
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel