
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] segmentation fault in mpiexec (Linux, Oracle/Sun C)
From: Terry Dontje (terry.dontje_at_[hidden])
Date: 2010-10-20 11:57:32


  Can you remove the --with-threads and --enable-mpi-threads options from
the configure line and see if that helps your 32-bit problem at all?
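
For example (an untested sketch: this is just your original configure
line below with the two threading options removed, everything else
unchanged):

../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_cc \
  CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
  CXXLDFLAGS="-m32" CPPFLAGS="" LDFLAGS="-m32" \
  C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
  OBJC_INCLUDE_PATH="" MPICHHOME="" \
  CC="cc" CXX="CC" F77="f95" FC="f95" \
  --without-udapl --enable-shared --enable-heterogeneous \
  --enable-cxx-exceptions

After rebuilding, running "ompi_info | grep -i thread" should show
which thread support actually got compiled in.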

--td
On 10/20/2010 09:38 AM, Siegmar Gross wrote:
> Hi,
>
> I have built Open MPI 1.5 on Linux x86_64 with the Oracle/Sun Studio C
> compiler. Unfortunately "mpiexec" reports a segmentation fault when I
> run a small program.
>
> linpc4 small_prog 106 cc -V
> cc: Sun C 5.10 Linux_i386 2009/06/03
> usage: cc [ options] files. Use 'cc -flags' for details
>
> linpc4 small_prog 107 uname -a
> Linux linpc4 2.6.27.45-0.1-default #1 SMP 2010-02-22 16:49:47 +0100 x86_64
> x86_64 x86_64 GNU/Linux
>
> linpc4 small_prog 108 mpicc -show
> cc -I/usr/local/openmpi-1.5_32_cc/include -mt
> -L/usr/local/openmpi-1.5_32_cc/lib -lmpi -ldl -Wl,--export-dynamic -lnsl
> -lutil -lm -ldl
>
> linpc4 small_prog 109 mpicc -m32 rank_size.c
> linpc4 small_prog 110 mpiexec -np 2 a.out
> I'm process 0 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
> I'm process 1 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
> [linpc4:11564] *** Process received signal ***
> [linpc4:11564] Signal: Segmentation fault (11)
> [linpc4:11564] Signal code: (128)
> [linpc4:11564] Failing at address: (nil)
> [linpc4:11565] *** Process received signal ***
> [linpc4:11565] Signal: Segmentation fault (11)
> [linpc4:11565] Signal code: (128)
> [linpc4:11565] Failing at address: (nil)
> [linpc4:11564] [ 0] [0xffffe410]
> [linpc4:11564] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_base_components_close+0x8c) [0xf774ccd0]
> [linpc4:11564] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_btl_base_close+0xc5) [0xf76bd255]
> [linpc4:11564] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_bml_base_close+0x32) [0xf76bd112]
> [linpc4:11564] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/mca_pml_ob1.so [0xf73d971f]
> [linpc4:11564] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_base_components_close+0x8c) [0xf774ccd0]
> [linpc4:11564] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_pml_base_close+0xc1) [0xf76e4385]
> [linpc4:11564] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1 [0xf76889e6]
> [linpc4:11564] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(PMPI_Finalize+0x3c) [0xf769dd4c]
> [linpc4:11564] [ 9] a.out(main+0x98) [0x8048a18]
> [linpc4:11564] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf749c705]
> [linpc4:11564] [11] a.out(_start+0x41) [0x8048861]
> [linpc4:11564] *** End of error message ***
> [linpc4:11565] [ 0] [0xffffe410]
> [linpc4:11565] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_base_components_close+0x8c) [0xf76bccd0]
> [linpc4:11565] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_btl_base_close+0xc5) [0xf762d255]
> [linpc4:11565] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_bml_base_close+0x32) [0xf762d112]
> [linpc4:11565] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/mca_pml_ob1.so [0xf734971f]
> [linpc4:11565] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_base_components_close+0x8c) [0xf76bccd0]
> [linpc4:11565] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(mca_pml_base_close+0xc1) [0xf7654385]
> [linpc4:11565] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1 [0xf75f89e6]
> [linpc4:11565] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1(PMPI_Finalize+0x3c) [0xf760dd4c]
> [linpc4:11565] [ 9] a.out(main+0x98) [0x8048a18]
> [linpc4:11565] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf740c705]
> [linpc4:11565] [11] a.out(_start+0x41) [0x8048861]
> [linpc4:11565] *** End of error message ***
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 0 with PID 11564 on node linpc4 exited
> on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpiexec during cleanup)
> linpc4 small_prog 111
>
>
> "make check" shows that one test failed.
>
> linpc4 openmpi-1.5-Linux.x86_64.32_cc 114 grep FAIL log.make-check.Linux.x86_64.32_cc
> FAIL: opal_path_nfs
> linpc4 openmpi-1.5-Linux.x86_64.32_cc 115 grep PASS log.make-check.Linux.x86_64.32_cc
> PASS: predefined_gap_test
> PASS: dlopen_test
> PASS: atomic_barrier
> PASS: atomic_barrier_noinline
> PASS: atomic_spinlock
> PASS: atomic_spinlock_noinline
> PASS: atomic_math
> PASS: atomic_math_noinline
> PASS: atomic_cmpset
> PASS: atomic_cmpset_noinline
> decode [PASSED]
> PASS: opal_datatype_test
> PASS: checksum
> PASS: position
> decode [PASSED]
> PASS: ddt_test
> decode [PASSED]
> PASS: ddt_raw
> linpc4 openmpi-1.5-Linux.x86_64.32_cc 116
>
> I used the following command to build the package.
>
> ../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_cc \
> CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
> CXXLDFLAGS="-m32" CPPFLAGS="" \
> LDFLAGS="-m32" \
> C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
> OBJC_INCLUDE_PATH="" MPICHHOME="" \
> CC="cc" CXX="CC" F77="f95" FC="f95" \
> --without-udapl --with-threads=posix --enable-mpi-threads \
> --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
> |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_cc
>
> I have also built the package with gcc-4.2.0 and it seems to work,
> although the opal_path_nfs test failed there as well. Therefore I'm
> not sure whether that failing test is related to the crash in the
> cc version.
>
> ../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_gcc \
> CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
> CXXLDFLAGS="-m32" CPPFLAGS="" \
> LDFLAGS="-m32" \
> C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
> OBJC_INCLUDE_PATH="" MPIHOME="" \
> CC="gcc" CPP="cpp" CXX="g++" CXXCPP="cpp" F77="gfortran" \
> --without-udapl --with-threads=posix --enable-mpi-threads \
> --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
> |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_gcc
>
> linpc4 small_prog 107 gcc -v
> Using built-in specs.
> Target: x86_64-unknown-linux-gnu
> Configured with: ../gcc-4.2.0/configure --prefix=/usr/local/gcc-4.2.0
> --enable-languages=c,c++,java,fortran,objc --enable-java-gc=boehm
> --enable-nls --enable-libgcj --enable-threads=posix
> Thread model: posix
> gcc version 4.2.0
>
> linpc4 small_prog 109 mpicc -show
> gcc -I/usr/local/openmpi-1.5_32_gcc/include -fexceptions -pthread
> -L/usr/local/openmpi-1.5_32_gcc/lib -lmpi -ldl -Wl,--export-dynamic
> -lnsl -lutil -lm -ldl
>
> linpc4 small_prog 110 mpicc -m32 rank_size.c
> linpc4 small_prog 111 mpiexec -np 2 a.out
> I'm process 0 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
> I'm process 1 of 2 available processes running on linpc4.
> MPI standard 2.1 is supported.
>
> linpc4 small_prog 112 grep FAIL /.../log.make-check.Linux.x86_64.32_gcc
> FAIL: opal_path_nfs
> linpc4 small_prog 113 grep PASS /.../log.make-check.Linux.x86_64.32_gcc
> PASS: predefined_gap_test
> PASS: dlopen_test
> PASS: atomic_barrier
> PASS: atomic_barrier_noinline
> PASS: atomic_spinlock
> PASS: atomic_spinlock_noinline
> PASS: atomic_math
> PASS: atomic_math_noinline
> PASS: atomic_cmpset
> PASS: atomic_cmpset_noinline
> decode [PASSED]
> PASS: opal_datatype_test
> PASS: checksum
> PASS: position
> decode [PASSED]
> PASS: ddt_test
> decode [NOT PASSED]
> PASS: ddt_raw
> linpc4 small_prog 114
>
>
> I used the following small test program.
>
> #include <stdio.h>
> #include <stdlib.h>
> #include "mpi.h"
>
> int main (int argc, char *argv[])
> {
>   int  ntasks,                /* number of parallel tasks */
>        mytid,                 /* my task id               */
>        version, subversion,   /* version of MPI standard  */
>        namelen;               /* length of processor name */
>   char processor_name[MPI_MAX_PROCESSOR_NAME];
>
>   MPI_Init (&argc, &argv);
>   MPI_Comm_rank (MPI_COMM_WORLD, &mytid);
>   MPI_Comm_size (MPI_COMM_WORLD, &ntasks);
>   MPI_Get_processor_name (processor_name, &namelen);
>   printf ("I'm process %d of %d available processes running on %s.\n",
>           mytid, ntasks, processor_name);
>   MPI_Get_version (&version, &subversion);
>   printf ("MPI standard %d.%d is supported.\n", version, subversion);
>   MPI_Finalize ();
>   return EXIT_SUCCESS;
> }
>
>
> Thank you very much in advance for any help solving the problem with
> the Oracle/Sun compiler.
>
>
> Best regards
>
> Siegmar
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.dontje_at_[hidden]


