
Open MPI Development Mailing List Archives


Subject: [OMPI devel] TIPC BTL Segmentation fault
From: Xin He (xin.i.he_at_[hidden])
Date: 2011-06-29 09:48:27


Hi,

While implementing the TIPC BTL, I added the component and ran the
hello_c example program to test it.

It crashed with a segmentation fault, which seems to occur after the call
to "mca_btl_tipc_add_procs".

The error message was:

[oak:23192] *** Process received signal ***
[oak:23192] Signal: Segmentation fault (11)
[oak:23192] Signal code: (128)
[oak:23192] Failing at address: (nil)
[oak:23192] [ 0] /lib/libpthread.so.0(+0xfb40) [0x7fec2a40fb40]
[oak:23192] [ 1] /usr/lib/libmpi.so.0(+0x1e6c10) [0x7fec2b2afc10]
[oak:23192] [ 2] /usr/lib/libmpi.so.0(+0x1e71f2) [0x7fec2b2b01f2]
[oak:23192] [ 3] /usr/lib/openmpi/mca_pml_ob1.so(+0x59f2) [0x7fec264fc9f2]
[oak:23192] [ 4] /usr/lib/openmpi/mca_pml_ob1.so(+0x5e5a) [0x7fec264fce5a]
[oak:23192] [ 5] /usr/lib/openmpi/mca_pml_ob1.so(+0x2386) [0x7fec264f9386]
[oak:23192] [ 6] /usr/lib/openmpi/mca_pml_ob1.so(+0x24a0) [0x7fec264f94a0]
[oak:23192] [ 7] /usr/lib/openmpi/mca_pml_ob1.so(+0x22fb) [0x7fec264f92fb]
[oak:23192] [ 8] /usr/lib/openmpi/mca_pml_ob1.so(+0x3a60) [0x7fec264faa60]
[oak:23192] [ 9] /usr/lib/libmpi.so.0(+0x67f51) [0x7fec2b130f51]
[oak:23192] [10] /usr/lib/libmpi.so.0(MPI_Init+0x173) [0x7fec2b161c33]
[oak:23192] [11] hello_i(main+0x22) [0x400936]
[oak:23192] [12] /lib/libc.so.6(__libc_start_main+0xfe) [0x7fec2a09bd8e]
[oak:23192] [13] hello_i() [0x400859]
[oak:23192] *** End of error message ***

I used gdb to check the stack:
(gdb) bt
#0 0x00007ffff7afac10 in opal_obj_run_constructors (object=0x6ca980)
     at ../opal/class/opal_object.h:427
#1 0x00007ffff7afb1f2 in opal_list_construct (list=0x6ca958)
     at class/opal_list.c:88
#2 0x00007ffff2d479f2 in opal_obj_run_constructors (object=0x6ca958)
     at ../../../../opal/class/opal_object.h:427
#3 0x00007ffff2d47e5a in mca_pml_ob1_comm_construct (comm=0x6ca8c0)
     at pml_ob1_comm.c:55
#4 0x00007ffff2d44386 in opal_obj_run_constructors (object=0x6ca8c0)
     at ../../../../opal/class/opal_object.h:427
#5 0x00007ffff2d444a0 in opal_obj_new (cls=0x7ffff2f6c040)
     at ../../../../opal/class/opal_object.h:477
#6 0x00007ffff2d442fb in opal_obj_new_debug (type=0x7ffff2f6c040,
     file=0x7ffff2d62840 "pml_ob1.c", line=182)
     at ../../../../opal/class/opal_object.h:252
#7 0x00007ffff2d45a60 in mca_pml_ob1_add_comm (comm=0x601060)
     at pml_ob1.c:182
#8 0x00007ffff797bf51 in ompi_mpi_init (argc=1, argv=0x7fffffffdf58, requested=0,
     provided=0x7fffffffde28) at runtime/ompi_mpi_init.c:770
#9 0x00007ffff79acc33 in PMPI_Init (argc=0x7fffffffde5c, argv=0x7fffffffde50)
     at pinit.c:84
#10 0x0000000000400936 in main (argc=1, argv=0x7fffffffdf58) at hello_c.c:17

The crash appears to happen while an object is being constructed: frame #0
is in opal_obj_run_constructors, called from opal_list_construct. Any idea
why this is happening?
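
For reference, here is a minimal sketch of the pattern that frames #0 and #2
appear to be executing. It assumes the constructor loop walks a
NULL-terminated array of function pointers reached through a class
descriptor stored in the object. All demo_* names are hypothetical, and this
is a simplified illustration, not the actual OPAL code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real OPAL definitions. */
typedef struct demo_object demo_object_t;
typedef void (*demo_construct_fn)(demo_object_t *);

typedef struct demo_class {
    const char *name;
    demo_construct_fn *construct_array;  /* NULL-terminated, parent first */
} demo_class_t;

struct demo_object {
    demo_class_t *cls;   /* class descriptor the constructors are read from */
    int constructed;     /* bookkeeping so the effect of each call is visible */
};

void demo_base_construct(demo_object_t *obj) { obj->constructed += 1; }
void demo_list_construct(demo_object_t *obj) { obj->constructed += 10; }

demo_construct_fn demo_list_ctors[] = {
    demo_base_construct, demo_list_construct, NULL
};
demo_class_t demo_list_class = { "demo_list", demo_list_ctors };

/* Walk the constructor chain. Returns 0 on success, or -1 if the class
 * descriptor is unusable. Without this check, an invalid descriptor or
 * array pointer would be dereferenced inside the loop. */
int demo_run_constructors(demo_object_t *obj)
{
    assert(obj != NULL);
    if (obj->cls == NULL || obj->cls->construct_array == NULL) {
        return -1;
    }
    for (demo_construct_fn *fn = obj->cls->construct_array; *fn != NULL; ++fn) {
        (*fn)(obj);
    }
    return 0;
}

/* Object with a valid class descriptor: both constructors run. */
int demo_construct_list(void)
{
    demo_object_t obj = { &demo_list_class, 0 };
    if (demo_run_constructors(&obj) != 0) {
        return -1;
    }
    return obj.constructed;
}

/* Object whose class pointer was never set (or was overwritten). */
int demo_construct_uninitialized(void)
{
    demo_object_t obj = { NULL, 0 };
    return demo_run_constructors(&obj);
}
```

In a loop of this shape, a fault inside the loop itself would be consistent
with a class pointer or constructor array that is no longer valid when the
object is built. That is only a guess on my side, but it is why I wonder
whether something written earlier (for example in my add_procs) could be
involved rather than the list code itself.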

Thanks.

Best regards,
Xin