Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Strange Segmentation Fault inside MPI_Init
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-09-11 04:56:59


Is there any chance you can update to Open MPI 1.4.2?
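
Also, it may be worth double-checking which installation your tools actually resolve to; your backtrace pulls libmca_common_sm.so from /usr/local/lib but libmpi.so from /usr/lib, which often means two Open MPI installations are being mixed. For example (assuming the Open MPI wrapper tools are on your PATH):

    which mpicc mpirun
    ompi_info | head

ompi_info reports the version and the prefix it was installed under, so you can tell which build you are really running.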

On Sep 11, 2010, at 9:35 AM, Srikanth Raju wrote:

> Hello OMPI Users,
> I'm using Open MPI 1.4.1 with gcc 4.4.3 on an x86_64 Linux system running Ubuntu 10.04. I can't seem to run any Open MPI application: even the simplest program, shown below, fails.
>
> #include <mpi.h>
>
> int main(int argc, char *argv[])
> {
>     /* Passing NULL, NULL is legal; MPI does not need argc/argv. */
>     MPI_Init(NULL, NULL);
>     MPI_Finalize();
>     return 0;
> }
>
> I compile it with "mpicc -g test.c" and run it with "mpirun -n 2 -hostfile hosts a.out".
> The hosts file contains a single line: "localhost slots=2".
> When I run it, I get this:
>
> [starbuck:18829] *** Process received signal ***
> [starbuck:18830] *** Process received signal ***
> [starbuck:18830] Signal: Segmentation fault (11)
> [starbuck:18830] Signal code: Address not mapped (1)
> [starbuck:18830] Failing at address: 0x3c
> [starbuck:18829] Signal: Segmentation fault (11)
> [starbuck:18829] Signal code: Address not mapped (1)
> [starbuck:18829] Failing at address: 0x3c
> [starbuck:18830] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7f3b0aae08f0]
> [starbuck:18830] [ 1] /usr/local/lib/libmca_common_sm.so.1(+0x1561) [0x7f3b082e8561]
> [starbuck:18830] [ 2] /usr/local/lib/libmca_common_sm.so.1(mca_common_sm_mmap_init+0x6c1) [0x7f3b082e9137]
> [starbuck:18830] [ 3] /usr/lib/openmpi/lib/openmpi/mca_mpool_sm.so(+0x137b) [0x7f3b084ed37b]
> [starbuck:18830] [ 4] /usr/lib/libmpi.so.0(mca_mpool_base_module_create+0x7d) [0x7f3b0bacc38d]
> [starbuck:18830] [ 5] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(+0x2a38) [0x7f3b06c52a38]
> [starbuck:18830] [ 6] /usr/lib/openmpi/lib/openmpi/mca_bml_r2.so(+0x18e7) [0x7f3b076a48e7]
> [starbuck:18830] [ 7] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x258c) [0x7f3b07aae58c]
> [starbuck:18830] [ 8] /usr/lib/libmpi.so.0(+0x392bf) [0x7f3b0ba8b2bf]
> [starbuck:18830] [ 9] /usr/lib/libmpi.so.0(MPI_Init+0x170) [0x7f3b0baac330]
> [starbuck:18830] [10] a.out(main+0x22) [0x400866]
> [starbuck:18830] [11] /lib/libc.so.6(__libc_start_main+0xfd) [0x7f3b0a76cc4d]
> [starbuck:18830] [12] a.out() [0x400789]
> [starbuck:18830] *** End of error message ***
> [starbuck:18829] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7fb6efefe8f0]
> [starbuck:18829] [ 1] /usr/local/lib/libmca_common_sm.so.1(+0x1561) [0x7fb6ed706561]
> [starbuck:18829] [ 2] /usr/local/lib/libmca_common_sm.so.1(mca_common_sm_mmap_init+0x6c1) [0x7fb6ed707137]
> [starbuck:18829] [ 3] /usr/lib/openmpi/lib/openmpi/mca_mpool_sm.so(+0x137b) [0x7fb6ed90b37b]
> [starbuck:18829] [ 4] /usr/lib/libmpi.so.0(mca_mpool_base_module_create+0x7d) [0x7fb6f0eea38d]
> [starbuck:18829] [ 5] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(+0x2a38) [0x7fb6ec070a38]
> [starbuck:18829] [ 6] /usr/lib/openmpi/lib/openmpi/mca_bml_r2.so(+0x18e7) [0x7fb6ecac28e7]
> [starbuck:18829] [ 7] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x258c) [0x7fb6ececc58c]
> [starbuck:18829] [ 8] /usr/lib/libmpi.so.0(+0x392bf) [0x7fb6f0ea92bf]
> [starbuck:18829] [ 9] /usr/lib/libmpi.so.0(MPI_Init+0x170) [0x7fb6f0eca330]
> [starbuck:18829] [10] a.out(main+0x22) [0x400866]
> [starbuck:18829] [11] /lib/libc.so.6(__libc_start_main+0xfd) [0x7fb6efb8ac4d]
> [starbuck:18829] [12] a.out() [0x400789]
> [starbuck:18829] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 18830 on node starbuck exited on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
>
> My stack trace from gdb is:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff43c2561 in opal_list_get_first (list=0x7ffff45c5240)
> at ../../../../../opal/class/opal_list.h:201
> 201 assert(1 == item->opal_list_item_refcount);
> (gdb) bt
> #0 0x00007ffff43c2561 in opal_list_get_first (list=0x7ffff45c5240)
> at ../../../../../opal/class/opal_list.h:201
> #1 0x00007ffff43c3137 in mca_common_sm_mmap_init (procs=0x673cb0,
> num_procs=2, size=67113040,
> file_name=0x673c40 "/tmp/openmpi-sessions-srikanth_at_starbuck_0/1510/1/shared_mem_pool.starbuck", size_ctl_structure=4176, data_seg_alignment=8)
> at ../../../../../ompi/mca/common/sm/common_sm_mmap.c:291
> #2 0x00007ffff45c737b in mca_mpool_sm_init (resources=<value optimized out>)
> at ../../../../../../ompi/mca/mpool/sm/mpool_sm_component.c:214
> #3 0x00007ffff7ba638d in mca_mpool_base_module_create ()
> from /usr/lib/libmpi.so.0
> #4 0x00007ffff2d2ca38 in sm_btl_first_time_init (btl=<value optimized out>,
> nprocs=<value optimized out>, procs=<value optimized out>,
> peers=<value optimized out>, reachability=<value optimized out>)
> at ../../../../../../ompi/mca/btl/sm/btl_sm.c:228
> #5 mca_btl_sm_add_procs (btl=<value optimized out>,
> nprocs=<value optimized out>, procs=<value optimized out>,
> peers=<value optimized out>, reachability=<value optimized out>)
> at ../../../../../../ompi/mca/btl/sm/btl_sm.c:500
> #6 0x00007ffff377e8e7 in mca_bml_r2_add_procs (nprocs=<value optimized out>,
> procs=0x2, reachable=0x7fffffffdd00)
> at ../../../../../../ompi/mca/bml/r2/bml_r2.c:206
> #7 0x00007ffff3b8858c in mca_pml_ob1_add_procs (procs=0x678ce0, nprocs=2)
> at ../../../../../../ompi/mca/pml/ob1/pml_ob1.c:315
> #8 0x00007ffff7b652bf in ?? () from /usr/lib/libmpi.so.0
> #9 0x00007ffff7b86330 in PMPI_Init () from /usr/lib/libmpi.so.0
> #10 0x0000000000400866 in main (argc=1, argv=0x7fffffffe008)
> at test.c:4
>
> I can't figure out what's going on here. It says MPI_Init is segfaulting, but I suspect some kind of misconfiguration.
> I have tried reinstalling the openmpi package. The machine has an AMD Turion X2 M500 (64-bit) processor.
>
> The interesting thing is that the segfault occurs only when I run multiple processes; with n = 1 there are no problems.
> Thanks for any help!
>
> --
> Regards,
> Srikanth Raju
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
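
Since both backtraces die inside mca_common_sm_mmap_init() and the crash only shows up with more than one process on the node, one quick diagnostic (just a way to isolate the problem, not a fix) is to exclude the shared-memory BTL and let the job fall back to tcp/self:

    mpirun --mca btl ^sm -n 2 -hostfile hosts a.out

If that runs cleanly, the failure is isolated to the sm component's shared-memory setup rather than your application.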

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/