Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] segmentation fault in mpiexec (Linux, Oracle/Sun C)
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-10-20 10:04:39


Just to be clear: it isn't mpiexec that is failing. It is your MPI application processes that are failing.
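
Both backtraces below show the ranks dying inside PMPI_Finalize while the MCA components (pml/bml/btl) are being closed, i.e. after your program has already produced its output. As a diagnostic sketch, not something from the original report, a program that does nothing but initialize and finalize MPI should tell you whether the crash comes from the library's shutdown path under the cc build rather than from anything your test program does:

#include <stdio.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
  /* Initialize and immediately shut down MPI.  The backtraces below
     point at PMPI_Finalize, so this is the smallest program that
     should still reproduce the crash if the finalization path of the
     cc-built library is at fault. */
  MPI_Init (&argc, &argv);
  printf ("before MPI_Finalize\n");
  MPI_Finalize ();
  printf ("after MPI_Finalize\n");
  return 0;
}

If this also segfaults when started with "mpiexec -np 2 a.out" under the Oracle/Sun build, the problem is in the library's cleanup code and not in anything the application does.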

On Oct 20, 2010, at 7:38 AM, Siegmar Gross wrote:

> Hi,
>
> I have built Open MPI 1.5 on Linux x86_64 with the Oracle/Sun Studio C
> compiler. Unfortunately "mpiexec" breaks when I run a small program.
>
> linpc4 small_prog 106 cc -V
> cc: Sun C 5.10 Linux_i386 2009/06/03
> usage: cc [ options] files. Use 'cc -flags' for details
>
> linpc4 small_prog 107 uname -a
> Linux linpc4 2.6.27.45-0.1-default #1 SMP 2010-02-22 16:49:47 +0100 x86_64
> x86_64 x86_64 GNU/Linux
>
> linpc4 small_prog 108 mpicc -show
> cc -I/usr/local/openmpi-1.5_32_cc/include -mt
> -L/usr/local/openmpi-1.5_32_cc/lib -lmpi -ldl -Wl,--export-dynamic -lnsl
> -lutil -lm -ldl
>
> linpc4 small_prog 109 mpicc -m32 rank_size.c
> linpc4 small_prog 110 mpiexec -np 2 a.out
> I'm process 0 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
> I'm process 1 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
> [linpc4:11564] *** Process received signal ***
> [linpc4:11564] Signal: Segmentation fault (11)
> [linpc4:11564] Signal code: (128)
> [linpc4:11564] Failing at address: (nil)
> [linpc4:11565] *** Process received signal ***
> [linpc4:11565] Signal: Segmentation fault (11)
> [linpc4:11565] Signal code: (128)
> [linpc4:11565] Failing at address: (nil)
> [linpc4:11564] [ 0] [0xffffe410]
> [linpc4:11564] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_base_components_close+0x8c) [0xf774ccd0]
> [linpc4:11564] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_btl_base_close+0xc5) [0xf76bd255]
> [linpc4:11564] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_bml_base_close+0x32) [0xf76bd112]
> [linpc4:11564] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/
> mca_pml_ob1.so [0xf73d971f]
> [linpc4:11564] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_base_components_close+0x8c) [0xf774ccd0]
> [linpc4:11564] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_pml_base_close+0xc1) [0xf76e4385]
> [linpc4:11564] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> [0xf76889e6]
> [linpc4:11564] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (PMPI_Finalize+0x3c) [0xf769dd4c]
> [linpc4:11564] [ 9] a.out(main+0x98) [0x8048a18]
> [linpc4:11564] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf749c705]
> [linpc4:11564] [11] a.out(_start+0x41) [0x8048861]
> [linpc4:11564] *** End of error message ***
> [linpc4:11565] [ 0] [0xffffe410]
> [linpc4:11565] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_base_components_close+0x8c) [0xf76bccd0]
> [linpc4:11565] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_btl_base_close+0xc5) [0xf762d255]
> [linpc4:11565] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_bml_base_close+0x32) [0xf762d112]
> [linpc4:11565] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/
> mca_pml_ob1.so [0xf734971f]
> [linpc4:11565] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_base_components_close+0x8c) [0xf76bccd0]
> [linpc4:11565] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (mca_pml_base_close+0xc1) [0xf7654385]
> [linpc4:11565] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> [0xf75f89e6]
> [linpc4:11565] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
> (PMPI_Finalize+0x3c) [0xf760dd4c]
> [linpc4:11565] [ 9] a.out(main+0x98) [0x8048a18]
> [linpc4:11565] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf740c705]
> [linpc4:11565] [11] a.out(_start+0x41) [0x8048861]
> [linpc4:11565] *** End of error message ***
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 0 with PID 11564 on node linpc4 exited
> on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpiexec during cleanup)
> linpc4 small_prog 111
>
>
> "make check" shows that one test failed.
>
> linpc4 openmpi-1.5-Linux.x86_64.32_cc 114 grep FAIL
> log.make-check.Linux.x86_64.32_cc
> FAIL: opal_path_nfs
> linpc4 openmpi-1.5-Linux.x86_64.32_cc 115 grep PASS
> log.make-check.Linux.x86_64.32_cc
> PASS: predefined_gap_test
> PASS: dlopen_test
> PASS: atomic_barrier
> PASS: atomic_barrier_noinline
> PASS: atomic_spinlock
> PASS: atomic_spinlock_noinline
> PASS: atomic_math
> PASS: atomic_math_noinline
> PASS: atomic_cmpset
> PASS: atomic_cmpset_noinline
> decode [PASSED]
> PASS: opal_datatype_test
> PASS: checksum
> PASS: position
> decode [PASSED]
> PASS: ddt_test
> decode [PASSED]
> PASS: ddt_raw
> linpc4 openmpi-1.5-Linux.x86_64.32_cc 116
>
> I used the following command to build the package.
>
> ../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_cc \
> CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
> CXXLDFLAGS="-m32" CPPFLAGS="" \
> LDFLAGS="-m32" \
> C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
> OBJC_INCLUDE_PATH="" MPICHHOME="" \
> CC="cc" CXX="CC" F77="f95" FC="f95" \
> --without-udapl --with-threads=posix --enable-mpi-threads \
> --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
> |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_cc
>
> I have also built the package with gcc-4.2.0 and it seems to work,
> although the opal_path_nfs test failed there as well. Therefore I'm not
> sure whether that failing test is responsible for the crash in the cc
> version.
>
> ../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_gcc \
> CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
> CXXLDFLAGS="-m32" CPPFLAGS="" \
> LDFLAGS="-m32" \
> C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
> OBJC_INCLUDE_PATH="" MPIHOME="" \
> CC="gcc" CPP="cpp" CXX="g++" CXXCPP="cpp" F77="gfortran" \
> --without-udapl --with-threads=posix --enable-mpi-threads \
> --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
> |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_gcc
>
> linpc4 small_prog 107 gcc -v
> Using built-in specs.
> Target: x86_64-unknown-linux-gnu
> Configured with: ../gcc-4.2.0/configure --prefix=/usr/local/gcc-4.2.0
> --enable-languages=c,c++,java,fortran,objc --enable-java-gc=boehm
> --enable-nls --enable-libgcj --enable-threads=posix
> Thread model: posix
> gcc version 4.2.0
>
> linpc4 small_prog 109 mpicc -show
> gcc -I/usr/local/openmpi-1.5_32_gcc/include -fexceptions -pthread
> -L/usr/local/openmpi-1.5_32_gcc/lib -lmpi -ldl -Wl,--export-dynamic
> -lnsl -lutil -lm -ldl
>
> linpc4 small_prog 110 mpicc -m32 rank_size.c
> linpc4 small_prog 111 mpiexec -np 2 a.out
> I'm process 0 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
> I'm process 1 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
>
> linpc4 small_prog 112 grep FAIL /.../log.make-check.Linux.x86_64.32_gcc
> FAIL: opal_path_nfs
> linpc4 small_prog 113 grep PASS /.../log.make-check.Linux.x86_64.32_gcc
> PASS: predefined_gap_test
> PASS: dlopen_test
> PASS: atomic_barrier
> PASS: atomic_barrier_noinline
> PASS: atomic_spinlock
> PASS: atomic_spinlock_noinline
> PASS: atomic_math
> PASS: atomic_math_noinline
> PASS: atomic_cmpset
> PASS: atomic_cmpset_noinline
> decode [PASSED]
> PASS: opal_datatype_test
> PASS: checksum
> PASS: position
> decode [PASSED]
> PASS: ddt_test
> decode [NOT PASSED]
> PASS: ddt_raw
> linpc4 small_prog 114
>
>
> I used the following small test program.
>
> #include <stdio.h>
> #include <stdlib.h>
> #include "mpi.h"
>
> int main (int argc, char *argv[])
> {
>   int  ntasks,               /* number of parallel tasks */
>        mytid,                /* my task id               */
>        version, subversion,  /* version of MPI standard  */
>        namelen;              /* length of processor name */
>   char processor_name[MPI_MAX_PROCESSOR_NAME];
>
>   MPI_Init (&argc, &argv);
>   MPI_Comm_rank (MPI_COMM_WORLD, &mytid);
>   MPI_Comm_size (MPI_COMM_WORLD, &ntasks);
>   MPI_Get_processor_name (processor_name, &namelen);
>   printf ("I'm process %d of %d available processes running on %s.\n",
>           mytid, ntasks, processor_name);
>   MPI_Get_version (&version, &subversion);
>   printf ("MPI standard %d.%d is supported.\n", version, subversion);
>   MPI_Finalize ();
>   return EXIT_SUCCESS;
> }
>
>
> Thank you very much in advance for any help in solving this problem
> with the Oracle/Sun compiler.
>
>
> Best regards
>
> Siegmar
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users