Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] orte_ns_base_select failed: returned value -1 instead of ORTE_SUCCESS
From: Ralph Castain (rhc_at_[hidden])
Date: 2008-01-31 17:58:09


Hmmm... well, my bad. There does indeed appear to be something funny going on
with Leopard. No idea what - it used to work fine. I haven't tested it in
a while, though - I've been test-building regularly on Leopard, but running on
Tiger (I misspoke earlier).

For now, I'm afraid you can't run on Leopard. I'll have to figure it out later
when I have more time.

Ralph

> ------ Forwarded Message
>> From: Aurélien Bouteiller <bouteill_at_[hidden]>
>> Reply-To: Open MPI Developers <devel_at_[hidden]>
>> Date: Thu, 31 Jan 2008 02:18:27 -0500
>> To: Open MPI Developers <devel_at_[hidden]>
>> Subject: Re: [OMPI devel] orte_ns_base_select failed: returned value -1
>> instead of ORTE_SUCCESS
>>
>> I tried using a fresh trunk; the same problem occurred. Here is the
>> complete configure line. I am using libtool 1.5.22 from Fink;
>> otherwise everything is standard OS X 10.5.
>>
>> $ ../trunk/configure --prefix=/Users/bouteill/ompi/build \
>>     --enable-mpirun-prefix-by-default --disable-io-romio \
>>     --enable-debug --enable-picky --enable-mem-debug \
>>     --enable-mem-profile --enable-visibility --disable-dlopen \
>>     --disable-shared --enable-static
>>
>> The error message generated by abort contains garbage (the line numbers do
>> not match anything in the .c files, and according to gdb the failure does
>> not occur during ns initialization). This looks like heap corruption
>> or something equally bad.
>>
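>> (Since heap corruption is the suspicion, a minimal sketch of how one might
>> probe it on OS X using the libmalloc debugging switches documented in
>> malloc(3); the NPmpi run line is the one from the original report, and
>> whether these hooks actually reach the launched processes is an assumption:)
>>
>> $ MallocScribble=1 MallocGuardEdges=1 mpirun -np 2 NetPIPE_3.6/NPmpi
>> $ DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib mpirun -np 2 NetPIPE_3.6/NPmpi
>>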
>> orterun (argc=4, argv=0xbffff81c) at ../../../../trunk/orte/tools/
>> orterun/orterun.c:529
>> 529 cb_states = ORTE_PROC_STATE_TERMINATED |
>> ORTE_PROC_STATE_AT_STG1;
>> (gdb) n
>> 530 rc = orte_rmgr.spawn_job(apps, num_apps, &jobid, 0, NULL,
>> job_state_callback, cb_states, &attributes);
>> (gdb) n
>> 531 while (NULL != (item = opal_list_remove_first(&attributes)))
>> OBJ_RELEASE(item);
>> (gdb) n
>> ** Stepping over inlined function code. **
>> 532 OBJ_DESTRUCT(&attributes);
>> (gdb) n
>> 534 if (orterun_globals.do_not_launch) {
>> (gdb) n
>> 539 OPAL_THREAD_LOCK(&orterun_globals.lock);
>> (gdb) n
>> 541 if (ORTE_SUCCESS == rc) {
>> (gdb) n
>> 542 while (!orterun_globals.exit) {
>> (gdb) n
>> 543 opal_condition_wait(&orterun_globals.cond,
>> (gdb) n
>> [grosse-pomme.local:77335] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
>> file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/
>> oob_base_init.c at line 74
>>
>> Aurelien
>>
>>
>> On Jan 30, 2008, at 17:18, Ralph Castain wrote:
>>
>>> Are you running on the trunk, or an earlier release?
>>>
>>> If the trunk, then I suspect you have a stale library hanging around.
>>> I build and run statically on Leopard regularly.
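>>>
>>> As a quick sketch of how to check for stale bits (the prefix here is the
>>> one from the configure line elsewhere in this thread, so adjust it to
>>> whatever --prefix was actually used; the "MCA ns" grep assumes ompi_info's
>>> usual component-listing format):
>>>
>>> $ which mpirun ompi_info             # confirm which install is being picked up
>>> $ ompi_info | grep "MCA ns"          # list the ns components that install knows about
>>> $ rm -rf /Users/bouteill/ompi/build  # clear the old install, then "make install" again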
>>>
>>>
>>> On 1/30/08 2:54 PM, "Aurélien Bouteiller" <bouteill_at_[hidden]>
>>> wrote:
>>>
>>>> I get a runtime error in a static build on Mac OS X 10.5 (automake 1.10,
>>>> autoconf 2.60, gcc-apple-darwin 4.0.1, libtool 1.5.22).
>>>>
>>>> The error does not occur in DSO builds, and everything seems to work
>>>> fine on Linux.
>>>>
>>>> Here is the error log.
>>>>
>>>> ~/ompi$ mpirun -np 2 NetPIPE_3.6/NPmpi
>>>> [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
>>>> file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/oob/base/
>>>> oob_base_init.c at line 74
>>>> [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Bad parameter in
>>>> file /SourceCache/openmpi/openmpi-5/openmpi/orte/mca/ns/proxy/
>>>> ns_proxy_component.c at line 222
>>>> [grosse-pomme.local:34247] [NO-NAME] ORTE_ERROR_LOG: Error in file /
>>>> SourceCache/openmpi/openmpi-5/openmpi/orte/runtime/orte_init_stage1.c
>>>> at line 230
>>>> --------------------------------------------------------------------------
>>>> It looks like orte_init failed for some reason; your parallel
>>>> process is
>>>> likely to abort. There are many reasons that a parallel process can
>>>> fail during orte_init; some of which are due to configuration or
>>>> environment problems. This failure appears to be an internal
>>>> failure;
>>>> here's some additional information (which may only be relevant to an
>>>> Open MPI developer):
>>>>
>>>> orte_ns_base_select failed
>>>> --> Returned value -1 instead of ORTE_SUCCESS
>>>>
>>>> --------------------------------------------------------------------------
>>>> --------------------------------------------------------------------------
>>>> It looks like MPI_INIT failed for some reason; your parallel
>>>> process is
>>>> likely to abort. There are many reasons that a parallel process can
>>>> fail during MPI_INIT; some of which are due to configuration or
>>>> environment
>>>> problems. This failure appears to be an internal failure; here's
>>>> some
>>>> additional information (which may only be relevant to an Open MPI
>>>> developer):
>>>>
>>>> ompi_mpi_init: orte_init_stage1 failed
>>>> --> Returned "Error" (-1) instead of "Success" (0)
>>>> --------------------------------------------------------------------------
>>>> *** An error occurred in MPI_Init
>>>> *** before MPI was initialized
>>>> *** MPI_ERRORS_ARE_FATAL (goodbye)
>>>>
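>>>>
>>>> (A hedged aside: mpirun's developer-debug switch, -d / --debug-devel, may
>>>> show where orte_init goes wrong; the output is noisy and mostly meant for
>>>> ORTE developers:)
>>>>
>>>> ~/ompi$ mpirun -d -np 2 NetPIPE_3.6/NPmpi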
>>>>
>>>>
>>>> --
>>>> Dr. Aurélien Bouteiller
>>>> Sr. Research Associate - Innovative Computing Laboratory
>>>> Suite 350, 1122 Volunteer Boulevard
>>>> Knoxville, TN 37996
>>>> 865 974 6321
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> ------ End of Forwarded Message
>