Open MPI Development Mailing List Archives

From: Andrew Friedley (afriedle_at_[hidden])
Date: 2005-10-21 08:04:56


I've managed to reproduce the segfault, but haven't yet figured out the
problem. I've got some distractions to attend to this afternoon, so it
might be a while before I get a fix.

Andrew

Troy Benjegerdes wrote:
> I just did a fresh build from the v1.0 branch. I just ran this from the
> command line. I guess I was hoping it was going to default to ssh to
> start things up.
>
> I also built this as a vpath build.... Does anyone else regularly build
> like that? It seems to at least confuse paths in gdb.
>
> More to come later..
>
> On Fri, Oct 21, 2005 at 09:31:57AM -0500, Jeff Squyres wrote:
>> Blah -- this is a segv when trying to print a help message. The help
>> message you should have gotten was:
>>
>> -----
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> orte_sds_base_select failed
>> --> Returned value ?? instead of ORTE_SUCCESS
>> -----
>>
>> I'll look into why this happened (segv instead of printing the message).
>>
>> However, the real issue is why you got this error in the first place.
>> What version of OMPI were you running (a nightly tarball, an rc tarball,
>> etc.)? What run-time environment were you using -- a batch scheduler or
>> simple rsh/ssh? Can you send the information listed in
>> http://www.open-mpi.org/community/help/ ?
>>
>>
>>
>> On Thu, 2005-10-20 at 22:43 -0500, Troy Benjegerdes wrote:
>>> Anyone know what's up here?
>>>
>>> troy_at_opteron1:~$ mpirun -np 2 hostname
>>> [opteron1.scl.ameslab.gov:01865] [NO-NAME] ORTE_ERROR_LOG: Not found in
>>> file ../../../ompi-svn_v1.0/orte/runtime/orte_init_stage1.c at line 212
>>> Segmentation fault
>>> troy_at_opteron1:~$ gdb
>>> -bash: gdb: command not found
>>> troy_at_opteron1:~$ gdb mpirun
>>> GNU gdb 6.3-debian
>>> Copyright 2004 Free Software Foundation, Inc.
>>> GDB is free software, covered by the GNU General Public License, and you
>>> are
>>> welcome to change it and/or distribute copies of it under certain
>>> conditions.
>>> Type "show copying" to see the conditions.
>>> There is absolutely no warranty for GDB. Type "show warranty" for
>>> details.
>>> This GDB was configured as "x86_64-linux"...Using host libthread_db
>>> library "/lib/libthread_db.so.1".
>>>
>>> (gdb) run -np 2 hostname
>>> Starting program: /usr/local/bin/mpirun -np 2 hostname
>>> [Thread debugging using libthread_db enabled]
>>> [New Thread 46912509168352 (LWP 7636)]
>>> [opteron1.scl.ameslab.gov:07636] [NO-NAME] ORTE_ERROR_LOG: Not found in
>>> file ../../../ompi-svn_v1.0/orte/runtime/orte_init_stage1.c at line 212
>>>
>>> Program received signal SIGSEGV, Segmentation fault.
>>> [Switching to Thread 46912509168352 (LWP 7636)]
>>> 0x00002aaaab3279d0 in strlen () from /lib/libc.so.6
>>> (gdb) bt
>>> #0 0x00002aaaab3279d0 in strlen () from /lib/libc.so.6
>>> #1 0x00002aaaab2fa158 in vfprintf () from /lib/libc.so.6
>>> #2 0x00002aaaab31931d in vasprintf () from /lib/libc.so.6
>>> #3 0x00002aaaab50b150 in output () from /usr/local/lib/libopal.so.0
>>> #4 0x00002aaaab50ae14 in opal_show_help () from
>>> /usr/local/lib/libopal.so.0
>>> #5 0x00002aaaaabd2a8d in orte_init_stage1 () from
>>> /usr/local/lib/liborte.so.0
>>> #6 0x00002aaaaabd594a in orte_system_init () from
>>> /usr/local/lib/liborte.so.0
>>> #7 0x00002aaaaabd2969 in orte_init () from /usr/local/lib/liborte.so.0
>>> #8 0x00000000004021d3 in orterun (argc=4, argv=0x7fffffd242a8)
>>> at ../../../../ompi-svn_v1.0/orte/tools/orterun/orterun.c:294
>>> #9 0x0000000000401f93 in main (argc=4, argv=0x7fffffd242a8)
>>> at ../../../../ompi-svn_v1.0/orte/tools/orterun/main.c:13
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> --
>> {+} Jeff Squyres
>> {+} The Open MPI Project
>> {+} http://www.open-mpi.org/
>
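
The backtrace above shows the crash in strlen(), called from vfprintf() while opal_show_help() was formatting the message. That pattern is consistent with a %s conversion receiving an argument that is not a valid string (a NULL or otherwise bad pointer, or an integer passed where a string was expected). The thread does not confirm which of these happened, so the following is only a minimal sketch of a defensive pattern, not the actual fix. The names str_or_placeholder, format_help, and format_init_failure are hypothetical and are not part of Open MPI.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: never hand a NULL pointer to a %s conversion. */
static const char *str_or_placeholder(const char *s)
{
    return (s != NULL) ? s : "(null)";
}

/* Hypothetical stand-in for a help-message formatter.
 * Returns a malloc'd string that the caller must free, or NULL on error. */
static char *format_help(const char *fmt, ...)
{
    va_list ap;
    int len;
    char *buf;

    assert(fmt != NULL);

    /* First pass: measure the formatted length. */
    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        return NULL;
    }

    buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        return NULL;
    }

    /* Second pass: write the message into the buffer. */
    va_start(ap, fmt);
    vsnprintf(buf, (size_t)len + 1, fmt, ap);
    va_end(ap);
    return buf;
}

/* Hypothetical wrapper: the return code is an int, so it is formatted
 * with %d rather than being passed to a %s conversion. */
static char *format_init_failure(const char *what, int rc)
{
    return format_help("%s failed\n--> Returned value %d instead of ORTE_SUCCESS\n",
                       str_or_placeholder(what), rc);
}
```

Under these assumptions, a call such as format_init_failure("orte_sds_base_select", rc) produces the two-line message quoted earlier in the thread with the numeric return code filled in, and a NULL name degrades to a visible placeholder instead of reaching strlen().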