Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] seg fault with intel compiler
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-05-31 16:54:08


This type of error usually means that you are inadvertently mixing versions of Open MPI (e.g., version A.B.C on one node and D.E.F on another node).

Ensure that your paths are set up consistently on every node, and that you're getting both the same Open MPI tools in your $PATH and the same libmpi.so in your $LD_LIBRARY_PATH.
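For example, a quick check like the following (a rough sketch: it assumes passwordless ssh between the nodes and uses $PBS_NODEFILE, the node list Torque provides inside a job) can show whether every allocated node resolves the same installation:

    # Sanity check: compare the Open MPI that each allocated node picks up.
    for h in $(sort -u $PBS_NODEFILE); do
      echo "== $h =="
      ssh $h 'which mpirun; ompi_info | head -n 3; echo $LD_LIBRARY_PATH'
    done

If any node reports a different prefix or version, or prints nothing at all because your environment isn't set for non-interactive shells, that is the mismatch to chase down.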

On May 31, 2012, at 3:43 PM, Edmund Sumbar wrote:

> Hi,
>
> I feel like a dope. I can't seem to successfully run the following simple test program (from the Intel MPI distribution) as a Torque batch job on a CentOS 5.7 cluster with Open MPI 1.6 compiled using Intel compilers 12.1.0.233.
>
> If I comment out MPI_Get_processor_name(), it works.
>
> #include "mpi.h"
> #include <stdio.h>
> #include <string.h>
>
> int
> main (int argc, char *argv[])
> {
>   int i, rank, size, namelen;
>   char name[MPI_MAX_PROCESSOR_NAME];
>   MPI_Status stat;
>
>   MPI_Init (&argc, &argv);
>
>   MPI_Comm_size (MPI_COMM_WORLD, &size);
>   MPI_Comm_rank (MPI_COMM_WORLD, &rank);
>   MPI_Get_processor_name (name, &namelen);
>
>   if (rank == 0) {
>     printf ("Hello world: rank %d of %d running on %s\n", rank, size, name);
>
>     for (i = 1; i < size; i++) {
>       MPI_Recv (&rank, 1, MPI_INT, i, 1, MPI_COMM_WORLD, &stat);
>       MPI_Recv (&size, 1, MPI_INT, i, 1, MPI_COMM_WORLD, &stat);
>       MPI_Recv (&namelen, 1, MPI_INT, i, 1, MPI_COMM_WORLD, &stat);
>       MPI_Recv (name, namelen + 1, MPI_CHAR, i, 1, MPI_COMM_WORLD, &stat);
>       printf ("Hello world: rank %d of %d running on %s\n", rank, size, name);
>     }
>   } else {
>     MPI_Send (&rank, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
>     MPI_Send (&size, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
>     MPI_Send (&namelen, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
>     MPI_Send (name, namelen + 1, MPI_CHAR, 0, 1, MPI_COMM_WORLD);
>   }
>
>   MPI_Finalize ();
>
>   return (0);
> }
>
> The result I get is:
>
> [cl2n007:19441] *** Process received signal ***
> [cl2n007:19441] Signal: Segmentation fault (11)
> [cl2n007:19441] Signal code: Address not mapped (1)
> [cl2n007:19441] Failing at address: 0x10
> [cl2n007:19441] [ 0] /lib64/libpthread.so.0 [0x306980ebe0]
> [cl2n007:19441] [ 1] /lustre/jasper/software/openmpi/openmpi-1.6-intel/lib/libmpi.so.1(opal_memory_ptmalloc2_int_malloc+0x4b3) [0x2af078563113]
> [cl2n007:19441] [ 2] /lustre/jasper/software/openmpi/openmpi-1.6-intel/lib/libmpi.so.1(opal_memory_ptmalloc2_malloc+0x59) [0x2af0785658a9]
> [cl2n007:19441] [ 3] /lustre/jasper/software/openmpi/openmpi-1.6-intel/lib/libmpi.so.1 [0x2af078565596]
> [cl2n007:19441] [ 4] /lustre/jasper/software/openmpi/openmpi-1.6-intel/lib/libmpi.so.1(opal_class_initialize+0xaa) [0x2af078582faa]
> [cl2n007:19441] [ 5] /lustre/jasper/software/openmpi/openmpi-1.6-intel/lib/openmpi/mca_btl_openib.so [0x2af07c3e1909]
> [cl2n007:19441] [ 6] /lib64/libpthread.so.0 [0x306980677d]
> [cl2n007:19441] [ 7] /lib64/libc.so.6(clone+0x6d) [0x3068cd325d]
> [cl2n007:19441] *** End of error message ***
> [cl2n006:11146] [[51262,0],8] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/nidmap.c at line 776
> [cl2n006:11146] [[51262,0],8] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ess_tm_module.c at line 310
> [cl2n006:11146] [[51262,0],8] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file base/odls_base_default_fns.c at line 2342
> [cl2n007:19434] [[51262,0],7] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/nidmap.c at line 776
> [cl2n007:19434] [[51262,0],7] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ess_tm_module.c at line 310
> [cl2n007:19434] [[51262,0],7] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file base/odls_base_default_fns.c at line 2342
> [cl2n005:13582] [[51262,0],9] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file util/nidmap.c at line 776
> [cl2n005:13582] [[51262,0],9] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ess_tm_module.c at line 310
> [cl2n005:13582] [[51262,0],9] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file base/odls_base_default_fns.c at line 2342
>
> ...more of the same...
>
>
> $ ompi_info
> Package: Open MPI root_at_[hidden] Distribution
> Open MPI: 1.6
> Open MPI SVN revision: r26429
> Open MPI release date: May 10, 2012
> Open RTE: 1.6
> Open RTE SVN revision: r26429
> Open RTE release date: May 10, 2012
> OPAL: 1.6
> OPAL SVN revision: r26429
> OPAL release date: May 10, 2012
> MPI API: 2.1
> Ident string: 1.6
> Prefix: /lustre/jasper/software/openmpi/openmpi-1.6-intel
> Configured architecture: x86_64-unknown-linux-gnu
> Configure host: jasper.westgrid.ca
> Configured by: root
> Configured on: Wed May 30 13:56:39 MDT 2012
> Configure host: jasper.westgrid.ca
> Built by: root
> Built on: Wed May 30 14:35:10 MDT 2012
> Built host: jasper.westgrid.ca
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: yes
> Fortran90 bindings size: small
> C compiler: icc
> C compiler absolute: /lustre/jasper/software/intel//l_ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/icc
> C compiler family name: INTEL
> C compiler version: 9999.20110811
> C++ compiler: icpc
> C++ compiler absolute: /lustre/jasper/software/intel//l_ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/icpc
> Fortran77 compiler: ifort
> Fortran77 compiler abs: /lustre/jasper/software/intel//l_ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
> Fortran90 compiler: ifort
> Fortran90 compiler abs: /lustre/jasper/software/intel//l_ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
> C++ exceptions: no
> Thread support: posix (MPI_THREAD_MULTIPLE: no, progress: no)
> Sparse Groups: no
> Internal debug support: no
> MPI interface warnings: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: no
> mpirun default --prefix: no
> MPI I/O support: yes
> MPI_WTIME support: gettimeofday
> Symbol vis. support: yes
> Host topology support: yes
> MPI extensions: affinity example
> FT Checkpoint support: no (checkpoint thread: no)
> VampirTrace support: yes
> MPI_MAX_PROCESSOR_NAME: 256
> MPI_MAX_ERROR_STRING: 256
> MPI_MAX_OBJECT_NAME: 64
> MPI_MAX_INFO_KEY: 36
> MPI_MAX_INFO_VAL: 256
> MPI_MAX_PORT_NAME: 1024
> MPI_MAX_DATAREP_STRING: 128
> MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.6)
> MCA memory: linux (MCA v2.0, API v2.0, Component v1.6)
> MCA paffinity: hwloc (MCA v2.0, API v2.0, Component v1.6)
> MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.6)
> MCA carto: file (MCA v2.0, API v2.0, Component v1.6)
> MCA shmem: mmap (MCA v2.0, API v2.0, Component v1.6)
> MCA shmem: posix (MCA v2.0, API v2.0, Component v1.6)
> MCA shmem: sysv (MCA v2.0, API v2.0, Component v1.6)
> MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.6)
> MCA maffinity: hwloc (MCA v2.0, API v2.0, Component v1.6)
> MCA timer: linux (MCA v2.0, API v2.0, Component v1.6)
> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.6)
> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.6)
> MCA sysinfo: linux (MCA v2.0, API v2.0, Component v1.6)
> MCA hwloc: hwloc132 (MCA v2.0, API v2.0, Component v1.6)
> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.6)
> MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.6)
> MCA allocator: basic (MCA v2.0, API v2.0, Component v1.6)
> MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.6)
> MCA coll: basic (MCA v2.0, API v2.0, Component v1.6)
> MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.6)
> MCA coll: inter (MCA v2.0, API v2.0, Component v1.6)
> MCA coll: self (MCA v2.0, API v2.0, Component v1.6)
> MCA coll: sm (MCA v2.0, API v2.0, Component v1.6)
> MCA coll: sync (MCA v2.0, API v2.0, Component v1.6)
> MCA coll: tuned (MCA v2.0, API v2.0, Component v1.6)
> MCA io: romio (MCA v2.0, API v2.0, Component v1.6)
> MCA mpool: fake (MCA v2.0, API v2.0, Component v1.6)
> MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.6)
> MCA mpool: sm (MCA v2.0, API v2.0, Component v1.6)
> MCA pml: bfo (MCA v2.0, API v2.0, Component v1.6)
> MCA pml: csum (MCA v2.0, API v2.0, Component v1.6)
> MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.6)
> MCA pml: v (MCA v2.0, API v2.0, Component v1.6)
> MCA bml: r2 (MCA v2.0, API v2.0, Component v1.6)
> MCA rcache: vma (MCA v2.0, API v2.0, Component v1.6)
> MCA btl: ofud (MCA v2.0, API v2.0, Component v1.6)
> MCA btl: openib (MCA v2.0, API v2.0, Component v1.6)
> MCA btl: self (MCA v2.0, API v2.0, Component v1.6)
> MCA btl: sm (MCA v2.0, API v2.0, Component v1.6)
> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.6)
> MCA topo: unity (MCA v2.0, API v2.0, Component v1.6)
> MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.6)
> MCA osc: rdma (MCA v2.0, API v2.0, Component v1.6)
> MCA iof: hnp (MCA v2.0, API v2.0, Component v1.6)
> MCA iof: orted (MCA v2.0, API v2.0, Component v1.6)
> MCA iof: tool (MCA v2.0, API v2.0, Component v1.6)
> MCA oob: tcp (MCA v2.0, API v2.0, Component v1.6)
> MCA odls: default (MCA v2.0, API v2.0, Component v1.6)
> MCA ras: cm (MCA v2.0, API v2.0, Component v1.6)
> MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.6)
> MCA ras: slurm (MCA v2.0, API v2.0, Component v1.6)
> MCA ras: tm (MCA v2.0, API v2.0, Component v1.6)
> MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.6)
> MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.6)
> MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.6)
> MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.6)
> MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.6)
> MCA rmaps: topo (MCA v2.0, API v2.0, Component v1.6)
> MCA rml: oob (MCA v2.0, API v2.0, Component v1.6)
> MCA routed: binomial (MCA v2.0, API v2.0, Component v1.6)
> MCA routed: cm (MCA v2.0, API v2.0, Component v1.6)
> MCA routed: direct (MCA v2.0, API v2.0, Component v1.6)
> MCA routed: linear (MCA v2.0, API v2.0, Component v1.6)
> MCA routed: radix (MCA v2.0, API v2.0, Component v1.6)
> MCA routed: slave (MCA v2.0, API v2.0, Component v1.6)
> MCA plm: rsh (MCA v2.0, API v2.0, Component v1.6)
> MCA plm: slurm (MCA v2.0, API v2.0, Component v1.6)
> MCA plm: tm (MCA v2.0, API v2.0, Component v1.6)
> MCA filem: rsh (MCA v2.0, API v2.0, Component v1.6)
> MCA errmgr: default (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: env (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: hnp (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: singleton (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: slave (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: slurm (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: slurmd (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: tm (MCA v2.0, API v2.0, Component v1.6)
> MCA ess: tool (MCA v2.0, API v2.0, Component v1.6)
> MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.6)
> MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.6)
> MCA grpcomm: hier (MCA v2.0, API v2.0, Component v1.6)
> MCA notifier: command (MCA v2.0, API v1.0, Component v1.6)
> MCA notifier: syslog (MCA v2.0, API v1.0, Component v1.6)
>
>
> --
> Edmund Sumbar
> University of Alberta
> +1 780 492 9360
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/