
Open MPI User's Mailing List Archives


This page is part of a frozen web archive of this mailing list; no new mails have been added to it since July 2016.

Subject: Re: [OMPI users] Strange Segmentation Fault inside MPI_Init
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-09-11 10:57:23


How did you configure OMPI?
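One thing worth checking from the backtrace itself: the crashing frames come from libraries under two different prefixes (`/usr/local/lib/libmca_common_sm.so.1` alongside `/usr/lib/libmpi.so.0` and `/usr/lib/openmpi/...`), which is the classic signature of a mixed installation. A minimal sketch of how to spot this, using the library paths copied from the trace below as sample data (the temp file name is illustrative):

```shell
# Sample data: library paths taken from the reported stack trace.
cat > /tmp/ompi_trace_libs.txt <<'EOF'
/usr/local/lib/libmca_common_sm.so.1
/usr/lib/openmpi/lib/openmpi/mca_mpool_sm.so
/usr/lib/libmpi.so.0
EOF

# Keep only the first two path components ("/usr/local", "/usr/lib");
# more than one distinct prefix means two installations are being mixed.
cut -d/ -f1-3 /tmp/ompi_trace_libs.txt | sort -u
```

On a live system, `ldd a.out | grep -i mpi` and `which mpicc mpirun` show the same thing directly: if two prefixes appear, the usual fix is to remove one installation or make PATH and LD_LIBRARY_PATH point consistently at the other.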

On Sep 11, 2010, at 1:35 AM, Srikanth Raju wrote:

> Hello OMPI Users,
> I'm using OpenMPI 1.4.1 with gcc 4.4.3 on my x86_64 linux system running the latest Ubuntu 10.04 distro. I don't seem to be able to run any OpenMPI application. I try running the simplest application, which goes like this
>
> #include <mpi.h>
>
> int main(int argc, char *argv[])
> {
>     MPI_Init(NULL, NULL);
>     MPI_Finalize();
>     return 0;
> }
>
> I compile it with "mpicc -g test.c" and run it with "mpirun -n 2 -hostfile hosts a.out", where the hosts file contains "localhost slots=2". On running, I get this:
>
>
> [starbuck:18829] *** Process received signal ***
> [starbuck:18830] *** Process received signal ***
> [starbuck:18830] Signal: Segmentation fault (11)
> [starbuck:18830] Signal code: Address not mapped (1)
> [starbuck:18830] Failing at address: 0x3c
> [starbuck:18829] Signal: Segmentation fault (11)
> [starbuck:18829] Signal code: Address not mapped (1)
> [starbuck:18829] Failing at address: 0x3c
> [starbuck:18830] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7f3b0aae08f0]
> [starbuck:18830] [ 1] /usr/local/lib/libmca_common_sm.so.1(+0x1561) [0x7f3b082e8561]
> [starbuck:18830] [ 2] /usr/local/lib/libmca_common_sm.so.1(mca_common_sm_mmap_init+0x6c1) [0x7f3b082e9137]
> [starbuck:18830] [ 3] /usr/lib/openmpi/lib/openmpi/mca_mpool_sm.so(+0x137b) [0x7f3b084ed37b]
> [starbuck:18830] [ 4] /usr/lib/libmpi.so.0(mca_mpool_base_module_create+0x7d) [0x7f3b0bacc38d]
> [starbuck:18830] [ 5] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(+0x2a38) [0x7f3b06c52a38]
> [starbuck:18830] [ 6] /usr/lib/openmpi/lib/openmpi/mca_bml_r2.so(+0x18e7) [0x7f3b076a48e7]
> [starbuck:18830] [ 7] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x258c) [0x7f3b07aae58c]
> [starbuck:18830] [ 8] /usr/lib/libmpi.so.0(+0x392bf) [0x7f3b0ba8b2bf]
> [starbuck:18830] [ 9] /usr/lib/libmpi.so.0(MPI_Init+0x170) [0x7f3b0baac330]
> [starbuck:18830] [10] a.out(main+0x22) [0x400866]
> [starbuck:18830] [11] /lib/libc.so.6(__libc_start_main+0xfd) [0x7f3b0a76cc4d]
> [starbuck:18830] [12] a.out() [0x400789]
> [starbuck:18830] *** End of error message ***
> [starbuck:18829] [ 0] /lib/libpthread.so.0(+0xf8f0) [0x7fb6efefe8f0]
> [starbuck:18829] [ 1] /usr/local/lib/libmca_common_sm.so.1(+0x1561) [0x7fb6ed706561]
> [starbuck:18829] [ 2] /usr/local/lib/libmca_common_sm.so.1(mca_common_sm_mmap_init+0x6c1) [0x7fb6ed707137]
> [starbuck:18829] [ 3] /usr/lib/openmpi/lib/openmpi/mca_mpool_sm.so(+0x137b) [0x7fb6ed90b37b]
> [starbuck:18829] [ 4] /usr/lib/libmpi.so.0(mca_mpool_base_module_create+0x7d) [0x7fb6f0eea38d]
> [starbuck:18829] [ 5] /usr/lib/openmpi/lib/openmpi/mca_btl_sm.so(+0x2a38) [0x7fb6ec070a38]
> [starbuck:18829] [ 6] /usr/lib/openmpi/lib/openmpi/mca_bml_r2.so(+0x18e7) [0x7fb6ecac28e7]
> [starbuck:18829] [ 7] /usr/lib/openmpi/lib/openmpi/mca_pml_ob1.so(+0x258c) [0x7fb6ececc58c]
> [starbuck:18829] [ 8] /usr/lib/libmpi.so.0(+0x392bf) [0x7fb6f0ea92bf]
> [starbuck:18829] [ 9] /usr/lib/libmpi.so.0(MPI_Init+0x170) [0x7fb6f0eca330]
> [starbuck:18829] [10] a.out(main+0x22) [0x400866]
> [starbuck:18829] [11] /lib/libc.so.6(__libc_start_main+0xfd) [0x7fb6efb8ac4d]
> [starbuck:18829] [12] a.out() [0x400789]
> [starbuck:18829] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 18830 on node starbuck exited on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
>
> My stack trace from gdb is:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff43c2561 in opal_list_get_first (list=0x7ffff45c5240)
> at ../../../../../opal/class/opal_list.h:201
> 201 assert(1 == item->opal_list_item_refcount);
> (gdb) bt
> #0 0x00007ffff43c2561 in opal_list_get_first (list=0x7ffff45c5240)
> at ../../../../../opal/class/opal_list.h:201
> #1 0x00007ffff43c3137 in mca_common_sm_mmap_init (procs=0x673cb0,
> num_procs=2, size=67113040,
> file_name=0x673c40 "/tmp/openmpi-sessions-srikanth_at_starbuck_0/1510/1/shared_mem_pool.starbuck", size_ctl_structure=4176, data_seg_alignment=8)
> at ../../../../../ompi/mca/common/sm/common_sm_mmap.c:291
> #2 0x00007ffff45c737b in mca_mpool_sm_init (resources=<value optimized out>)
> at ../../../../../../ompi/mca/mpool/sm/mpool_sm_component.c:214
> #3 0x00007ffff7ba638d in mca_mpool_base_module_create ()
> from /usr/lib/libmpi.so.0
> #4 0x00007ffff2d2ca38 in sm_btl_first_time_init (btl=<value optimized out>,
> nprocs=<value optimized out>, procs=<value optimized out>,
> peers=<value optimized out>, reachability=<value optimized out>)
> at ../../../../../../ompi/mca/btl/sm/btl_sm.c:228
> #5 mca_btl_sm_add_procs (btl=<value optimized out>,
> nprocs=<value optimized out>, procs=<value optimized out>,
> peers=<value optimized out>, reachability=<value optimized out>)
> at ../../../../../../ompi/mca/btl/sm/btl_sm.c:500
> #6 0x00007ffff377e8e7 in mca_bml_r2_add_procs (nprocs=<value optimized out>,
> procs=0x2, reachable=0x7fffffffdd00)
> at ../../../../../../ompi/mca/bml/r2/bml_r2.c:206
> #7 0x00007ffff3b8858c in mca_pml_ob1_add_procs (procs=0x678ce0, nprocs=2)
> ---Type <return> to continue, or q <return> to quit---
> at ../../../../../../ompi/mca/pml/ob1/pml_ob1.c:315
> #8 0x00007ffff7b652bf in ?? () from /usr/lib/libmpi.so.0
> #9 0x00007ffff7b86330 in PMPI_Init () from /usr/lib/libmpi.so.0
> #10 0x0000000000400866 in main (argc=1, argv=0x7fffffffe008)
> at test.c:4
>
> I can't figure out what's going on here. The trace says MPI_Init is segfaulting, but I suspect some kind of misconfiguration. I have tried reinstalling the openmpi package. My processor is an AMD Turion X2 M500 (64-bit).
>
> The interesting thing is that the segfault occurs only when I run multiple processes; with n = 1 there are no problems.
> Thanks for any help!
>
> --
> Regards,
> Srikanth Raju
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users