
Subject: [OMPI users] segmentation fault in mpiexec (Linux, Oracle/Sun C)
From: Siegmar Gross (Siegmar.Gross_at_[hidden])
Date: 2010-10-20 09:38:09


Hi,

I have built Open MPI 1.5 on Linux x86_64 with the Oracle/Sun Studio C
compiler. Unfortunately "mpiexec" crashes with a segmentation fault when
I run a small program.

linpc4 small_prog 106 cc -V
cc: Sun C 5.10 Linux_i386 2009/06/03
usage: cc [ options] files. Use 'cc -flags' for details

linpc4 small_prog 107 uname -a
Linux linpc4 2.6.27.45-0.1-default #1 SMP 2010-02-22 16:49:47 +0100 x86_64
x86_64 x86_64 GNU/Linux

linpc4 small_prog 108 mpicc -show
cc -I/usr/local/openmpi-1.5_32_cc/include -mt
  -L/usr/local/openmpi-1.5_32_cc/lib -lmpi -ldl -Wl,--export-dynamic -lnsl
  -lutil -lm -ldl

linpc4 small_prog 109 mpicc -m32 rank_size.c
linpc4 small_prog 110 mpiexec -np 2 a.out
I'm process 0 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.
I'm process 1 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.
[linpc4:11564] *** Process received signal ***
[linpc4:11564] Signal: Segmentation fault (11)
[linpc4:11564] Signal code: (128)
[linpc4:11564] Failing at address: (nil)
[linpc4:11565] *** Process received signal ***
[linpc4:11565] Signal: Segmentation fault (11)
[linpc4:11565] Signal code: (128)
[linpc4:11565] Failing at address: (nil)
[linpc4:11564] [ 0] [0xffffe410]
[linpc4:11564] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_base_components_close+0x8c) [0xf774ccd0]
[linpc4:11564] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_btl_base_close+0xc5) [0xf76bd255]
[linpc4:11564] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_bml_base_close+0x32) [0xf76bd112]
[linpc4:11564] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/
  mca_pml_ob1.so [0xf73d971f]
[linpc4:11564] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_base_components_close+0x8c) [0xf774ccd0]
[linpc4:11564] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_pml_base_close+0xc1) [0xf76e4385]
[linpc4:11564] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  [0xf76889e6]
[linpc4:11564] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (PMPI_Finalize+0x3c) [0xf769dd4c]
[linpc4:11564] [ 9] a.out(main+0x98) [0x8048a18]
[linpc4:11564] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf749c705]
[linpc4:11564] [11] a.out(_start+0x41) [0x8048861]
[linpc4:11564] *** End of error message ***
[linpc4:11565] [ 0] [0xffffe410]
[linpc4:11565] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_base_components_close+0x8c) [0xf76bccd0]
[linpc4:11565] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_btl_base_close+0xc5) [0xf762d255]
[linpc4:11565] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_bml_base_close+0x32) [0xf762d112]
[linpc4:11565] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/
  mca_pml_ob1.so [0xf734971f]
[linpc4:11565] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_base_components_close+0x8c) [0xf76bccd0]
[linpc4:11565] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (mca_pml_base_close+0xc1) [0xf7654385]
[linpc4:11565] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  [0xf75f89e6]
[linpc4:11565] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
  (PMPI_Finalize+0x3c) [0xf760dd4c]
[linpc4:11565] [ 9] a.out(main+0x98) [0x8048a18]
[linpc4:11565] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf740c705]
[linpc4:11565] [11] a.out(_start+0x41) [0x8048861]
[linpc4:11565] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 11564 on node linpc4 exited
  on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
2 total processes killed (some possibly by mpiexec during cleanup)
linpc4 small_prog 111
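
The backtraces of both processes are identical: the crash occurs in
mca_base_components_close, reached from PMPI_Finalize via the PML/BML/BTL
shutdown, i.e. after my program has already done all of its work. To
check whether the teardown alone triggers the crash, one could compile a
stripped-down program that does nothing except initialize and finalize
MPI (just a sketch, no communication at all):

#include <stdio.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
  MPI_Init (&argc, &argv);
  MPI_Finalize ();
  /* if the component teardown segfaults, this line is never reached */
  printf ("MPI_Finalize returned normally.\n");
  return 0;
}

If this program also dies with the cc version, the problem is in the
component shutdown and not in anything the application does. Rebuilding
the package with CFLAGS="-m32 -g" and loading a core file into a
debugger should then show the exact line in mca_base_components_close.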

"make check" shows that one test failed.

linpc4 openmpi-1.5-Linux.x86_64.32_cc 114 grep FAIL
  log.make-check.Linux.x86_64.32_cc
FAIL: opal_path_nfs
linpc4 openmpi-1.5-Linux.x86_64.32_cc 115 grep PASS
  log.make-check.Linux.x86_64.32_cc
PASS: predefined_gap_test
PASS: dlopen_test
PASS: atomic_barrier
PASS: atomic_barrier_noinline
PASS: atomic_spinlock
PASS: atomic_spinlock_noinline
PASS: atomic_math
PASS: atomic_math_noinline
PASS: atomic_cmpset
PASS: atomic_cmpset_noinline
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [PASSED]
PASS: ddt_raw
linpc4 openmpi-1.5-Linux.x86_64.32_cc 116

I used the following command to build the package.

../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_cc \
  CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
  CXXLDFLAGS="-m32" CPPFLAGS="" \
  LDFLAGS="-m32" \
  C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
  OBJC_INCLUDE_PATH="" MPICHHOME="" \
  CC="cc" CXX="CC" F77="f95" FC="f95" \
  --without-udapl --with-threads=posix --enable-mpi-threads \
  --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_cc

I have also built the package with gcc-4.2.0, and that version seems to
work, although the opal_path_nfs test failed there as well. Therefore I'm
not sure whether the failing test is related to the crash in the
cc version.

../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_gcc \
  CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
  CXXLDFLAGS="-m32" CPPFLAGS="" \
  LDFLAGS="-m32" \
  C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
  OBJC_INCLUDE_PATH="" MPIHOME="" \
  CC="gcc" CPP="cpp" CXX="g++" CXXCPP="cpp" F77="gfortran" \
  --without-udapl --with-threads=posix --enable-mpi-threads \
  --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_gcc

linpc4 small_prog 107 gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.2.0/configure --prefix=/usr/local/gcc-4.2.0
  --enable-languages=c,c++,java,fortran,objc --enable-java-gc=boehm
  --enable-nls --enable-libgcj --enable-threads=posix
Thread model: posix
gcc version 4.2.0

linpc4 small_prog 109 mpicc -show
gcc -I/usr/local/openmpi-1.5_32_gcc/include -fexceptions -pthread
  -L/usr/local/openmpi-1.5_32_gcc/lib -lmpi -ldl -Wl,--export-dynamic
  -lnsl -lutil -lm -ldl

linpc4 small_prog 110 mpicc -m32 rank_size.c
linpc4 small_prog 111 mpiexec -np 2 a.out
I'm process 0 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.
I'm process 1 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.

linpc4 small_prog 112 grep FAIL /.../log.make-check.Linux.x86_64.32_gcc
FAIL: opal_path_nfs
linpc4 small_prog 113 grep PASS /.../log.make-check.Linux.x86_64.32_gcc
PASS: predefined_gap_test
PASS: dlopen_test
PASS: atomic_barrier
PASS: atomic_barrier_noinline
PASS: atomic_spinlock
PASS: atomic_spinlock_noinline
PASS: atomic_math
PASS: atomic_math_noinline
PASS: atomic_cmpset
PASS: atomic_cmpset_noinline
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [NOT PASSED]
PASS: ddt_raw
linpc4 small_prog 114

I used the following small test program.

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
  int ntasks,               /* number of parallel tasks */
      mytid,                /* my task id               */
      version, subversion,  /* version of MPI standard  */
      namelen;              /* length of processor name */
  char processor_name[MPI_MAX_PROCESSOR_NAME];

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &mytid);
  MPI_Comm_size (MPI_COMM_WORLD, &ntasks);
  MPI_Get_processor_name (processor_name, &namelen);
  printf ("I'm process %d of %d available processes running on %s.\n",
          mytid, ntasks, processor_name);
  MPI_Get_version (&version, &subversion);
  printf ("MPI standard %d.%d is supported.\n", version, subversion);
  MPI_Finalize ();
  return EXIT_SUCCESS;
}
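
The transcript above shows that both printf calls completed before the
signal arrived, which matches the backtrace. Just to double-check that
MPI_Finalize itself never returns in the cc build (and does return in
the gcc build), a variant with markers directly around the call could
be used (a sketch; stderr is unbuffered, so the markers cannot be lost):

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
  int mytid; /* my task id */

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &mytid);
  fprintf (stderr, "process %d: before MPI_Finalize\n", mytid);
  MPI_Finalize ();
  /* with the cc build this marker should never appear */
  fprintf (stderr, "process %d: after MPI_Finalize\n", mytid);
  return EXIT_SUCCESS;
}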

Thank you very much in advance for any help in solving this problem
with the Oracle/Sun compiler.

Best regards

Siegmar