
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] SegFault with MPI_THREAD_MULTIPLE in 1.2.4
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-11-28 11:27:51


This is to be expected. OMPI's support for THREAD_MULTIPLE is
incomplete and most likely doesn't work.
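
In general, the portable idiom is to treat the level passed to MPI_Init_thread as a request and to check the level actually provided before letting multiple threads call MPI. That check won't help here, since your crash happens inside MPI_Init_thread itself, but for reference, a minimal sketch of the usual pattern:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int provided = MPI_THREAD_SINGLE;

    /* Ask for full thread support; 'provided' reports the level the
       library actually grants, which may be lower than requested. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            fprintf(stderr, "MPI_THREAD_MULTIPLE not granted (got %d); "
                    "restricting MPI calls to a single thread\n", provided);
        /* ... fall back: only one thread makes MPI calls ... */
    }

    MPI_Finalize();
    return 0;
}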

On Nov 25, 2007, at 6:45 PM, Emilio J. Padron wrote:

> Hi,
>
> it's my first message here, so greetings to everyone (and sorry about
> my poor English) :-)
>
> I'm coding a parallel algorithm and this week I decided to upgrade the
> Open MPI version used in our cluster (from 1.2.3). After that, problems
> arose :-/
>
> There seems to be a problem with multithreading support in Open MPI
> 1.2.4, at least in my installation. The problem appears when more than
> one process per node is spawned. A simple *hello world* program (with
> no sends/receives) works fine in MPI_THREAD_SINGLE mode, but when I
> try MPI_THREAD_MULTIPLE this error arises:
>
> /opt/openmpi/bin/mpirun -np 2 -machinefile /home/users/emilioj/machinefileOpenMPI --debug-daemons justhi
> Daemon [0,0,1] checking in as pid 5446 on host c0-0
> [pvfs2-compute-0-0.local:05446] [0,0,1] orted: received launch callback
> [pvfs2-compute-0-0:05447] *** Process received signal ***
> [pvfs2-compute-0-0:05447] Signal: Segmentation fault (11)
> [pvfs2-compute-0-0:05447] Signal code: Address not mapped (1)
> [pvfs2-compute-0-0:05447] Failing at address: (nil)
> [pvfs2-compute-0-0:05448] *** Process received signal ***
> [pvfs2-compute-0-0:05448] Signal: Segmentation fault (11)
> [pvfs2-compute-0-0:05448] Signal code: Address not mapped (1)
> [pvfs2-compute-0-0:05448] Failing at address: (nil)
> [pvfs2-compute-0-0:05448] [ 0] /lib/tls/libpthread.so.0 [0xbb2890]
> [pvfs2-compute-0-0:05448] [ 1] /opt/openmpi/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x39) [0x4b1d99]
> [pvfs2-compute-0-0:05448] [ 2] /opt/openmpi/lib/libopen-pal.so.0(opal_progress+0x65) [0x592265]
> [pvfs2-compute-0-0:05448] [ 3] /opt/openmpi/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x29) [0x20a731]
> [pvfs2-compute-0-0:05448] [ 4] /opt/openmpi/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x365) [0x20f301]
> [pvfs2-compute-0-0:05448] [ 5] /opt/openmpi/lib/libopen-rte.so.0(mca_oob_recv_packed+0x38) [0x13c6a0]
> [pvfs2-compute-0-0:05448] [ 6] /opt/openmpi/lib/libopen-rte.so.0(mca_oob_xcast+0xa0e) [0x13d36a]
> [pvfs2-compute-0-0:05448] [ 7] /opt/openmpi/lib/libmpi.so.0(ompi_mpi_init+0x566) [0xda9f22]
> [pvfs2-compute-0-0:05447] [ 0] /lib/tls/libpthread.so.0 [0xbb2890]
> [pvfs2-compute-0-0:05447] [ 1] /opt/openmpi/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x39) [0x305d99]
> [pvfs2-compute-0-0:05447] [ 2] /opt/openmpi/lib/libopen-pal.so.0(opal_progress+0x65) [0x9fb265]
> [pvfs2-compute-0-0:05447] [ 3] /opt/openmpi/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x29) [0x2ed731]
> [pvfs2-compute-0-0:05447] [ 4] /opt/openmpi/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x365) [0x2f2301]
> [pvfs2-compute-0-0:05447] [ 5] /opt/openmpi/lib/libopen-rte.so.0(mca_oob_recv_packed+0x38) [0x53c6a0]
> [pvfs2-compute-0-0:05447] [ 6] /opt/openmpi/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x1b0) [0x2c4fc8]
> [pvfs2-compute-0-0:05447] [ 7] /opt/openmpi/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x244) [0x551420]
> [pvfs2-compute-0-0:05447] [ 8] /opt/openmpi/lib/libmpi.so.0(ompi_mpi_init+0x52e) [0x13ceea]
> [pvfs2-compute-0-0:05447] [ 9] /opt/openmpi/lib/libmpi.so.0(PMPI_Init_thread+0x5c) [0x15e844]
> [pvfs2-compute-0-0:05447] [10] justhi(main+0x36) [0x8048782]
> [pvfs2-compute-0-0:05448] [ 8] /opt/openmpi/lib/libmpi.so.0(PMPI_Init_thread+0x5c) [0xdcb844]
> [pvfs2-compute-0-0:05448] [ 9] justhi(main+0x36) [0x8048782]
> [pvfs2-compute-0-0:05448] [10] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0x970de3]
> [pvfs2-compute-0-0:05448] [11] justhi [0x80486c5]
> [pvfs2-compute-0-0:05448] *** End of error message ***
> [pvfs2-compute-0-0:05447] [11] /lib/tls/libc.so.6(__libc_start_main+0xd3) [0x1a0de3]
> [pvfs2-compute-0-0:05447] [12] justhi [0x80486c5]
> [pvfs2-compute-0-0:05447] *** End of error message ***
> [pvfs2-compute-0-0.local:05446] [0,0,1] orted_recv_pls: received message from [0,0,0]
> [pvfs2-compute-0-0.local:05446] [0,0,1] orted_recv_pls: received kill_local_procs
>
>
> [Ctrl+Z and kill -9 are needed to finish the execution]
>
> The machinefile contains:
>
> c0-0 slots=4
> c0-1 slots=4
> c0-2 slots=4
> c0-3 slots=4
> ...
>
> If processes are forced to be spawned on different nodes (c0-0 slots=1,
> c0-1 slots=1, c0-2 slots=1, c0-3 slots=1...) then there is no error :-?
> With version 1.2.3 (same *configure* options) everything runs
> perfectly.
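
Since the failure only shows up with more than one process per node, one way to test whether the shared-memory path is involved is to exclude the sm BTL, e.g. running with "mpirun --mca btl ^sm ...", so that all communication falls back to TCP. I can't say whether that avoids this particular init-time crash, but it should help narrow the problem down.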
>
> The ompi_info output for my Open MPI 1.2.4 installation:
> Open MPI: 1.2.4
> Open MPI SVN revision: r16187
> Open RTE: 1.2.4
> Open RTE SVN revision: r16187
> OPAL: 1.2.4
> OPAL SVN revision: r16187
> Prefix: /opt/openmpi
> Configured architecture: i686-pc-linux-gnu
> Configured by: root
> Configured on: Sun Nov 25 20:13:42 CET 2007
> Configure host: pvfs2-compute-0-0.local
> Built by: root
> Built on: Sun Nov 25 20:19:55 CET 2007
> Built host: pvfs2-compute-0-0.local
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: no
> Fortran90 bindings: no
> Fortran90 bindings size: na
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: none
> Fortran77 compiler abs: none
> Fortran90 compiler: none
> Fortran90 compiler abs: none
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: no
> Fortran90 profiling: no
> C++ exceptions: no
> Thread support: posix (mpi: yes, progress: no)
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> Heterogeneous support: yes
> mpirun default --prefix: no
> MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.4)
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.4)
> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.4)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.4)
> MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.2.4)
> MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.4)
> MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.4)
> MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.4)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.4)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.2.4)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.4)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.4)
> MCA io: romio (MCA v1.0, API v1.0, Component v1.2.4)
> MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.4)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.4)
> MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.4)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.4)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.4)
> MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.4)
> MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.4)
> MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.4)
> MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.4)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.4)
> MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.4)
> MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.4)
> MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.4)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.4)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.4)
> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.4)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.4)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.4)
> MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.4)
> MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.4)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.4)
> MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.4)
> MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.4)
> MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.4)
> MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.4)
> MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.4)
> MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.4)
> MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.4)
> MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.4)
> MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.4)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.4)
> MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.4)
> MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.4)
> MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.4)
> MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.4)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.2.4)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.4)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.4)
> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.4)
> MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.4)
>
> and the naive program I'm testing:
>
> $ cat justhi.c
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <mpi.h>
>
> MPI_Status status;
>
> int main( int argc, char *argv[] )
> {
>   int myid, nprocs, threadlevel = 0;
>
>   /* Request full multithreading support; threadlevel receives the
>      level actually provided by the library. */
>   MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &threadlevel);
>   // MPI_Init (&argc, &argv);
>
>   MPI_Comm_rank (MPI_COMM_WORLD, &myid);
>   MPI_Comm_size (MPI_COMM_WORLD, &nprocs);
>
>   if (myid < 0)
>     MPI_Abort (MPI_COMM_WORLD, 1);
>
>   if (myid != 0) {
>     fprintf (stdout, "Hi, P%d ready sir!\n", myid);
>   } else {
>     fprintf (stdout, "\nWho rules here!! (%d procs - thread-level: %d)\n\n",
>              nprocs, threadlevel);
>   }
>
>   MPI_Finalize ();
>
>   return (0);
> }
>
> compiled with:
> /opt/openmpi/bin/mpicc -Wall -o justhi justhi.c
>
> Thanks in advance :-)
> Emilio.
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
Cisco Systems