Hello all,
I sometimes run into deadlocks in Open MPI (1.3.3a1r21206) when
running my MPI+threaded PT-Scotch software. Luckily, the test case
is very small, with only 4 procs, so I have been able to investigate
it a bit. It seems that communications are not matched
properly on cloned communicators. In the end, I run into a case where
an MPI_Waitall completes an MPI_Barrier on another proc. The bug is
erratic but, luckily too, quite easy to reproduce.
To be sure, I ran my code under valgrind using helgrind, its
race condition detection tool. It produced a lot of output, most
of which seems to be innocuous, yet I have some concerns about
messages such as the following ones. The ==12**== ones were generated
when running on 4 procs, while the ==83**== ones were generated
when running on 2 procs:
==8329== Possible data race during write of size 4 at 0x8882200
==8329== at 0x508B315: sm_fifo_write (btl_sm.h:254)
==8329== by 0x508B401: mca_btl_sm_send (btl_sm.c:811)
==8329== by 0x5070A0C: mca_bml_base_send_status (bml.h:288)
==8329== by 0x50708E6: mca_pml_ob1_send_request_start_copy (pml_ob1_sendreq.c:567)
==8329== by 0x5064C30: mca_pml_ob1_send_request_start_btl (pml_ob1_sendreq.h:363)
==8329== by 0x5064A19: mca_pml_ob1_send_request_start (pml_ob1_sendreq.h:429)
==8329== by 0x5064856: mca_pml_ob1_isend (pml_ob1_isend.c:87)
==8329== by 0x5142C46: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:51)
==8329== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8329== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8329== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8329== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8329== Old state: shared-readonly by threads #1, #7
==8329== New state: shared-modified by threads #1, #7
==8329== Reason: this thread, #1, holds no consistent locks
==8329== Location 0x8882200 has never been protected by any lock
==1220== Possible data race during write of size 4 at 0x88CEF88
==1220== at 0x508CD84: sm_fifo_read (btl_sm.h:272)
==1220== by 0x508C864: mca_btl_sm_component_progress (btl_sm_component.c:391)
==1220== by 0x41F72DF: opal_progress (opal_progress.c:207)
==1220== by 0x40BD67D: opal_condition_wait (condition.h:85)
==1220== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==1220== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==1220== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1220== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1220== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1220== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1220== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==1220== by 0x805EA43: kdgraphMapRbPart2 (kdgraph_map_rb_part.c:331)
==1220== Old state: shared-readonly by threads #1, #7
==1220== New state: shared-modified by threads #1, #7
==1220== Reason: this thread, #1, holds no consistent locks
==1220== Location 0x88CEF88 has never been protected by any lock
==1219== Possible data race during write of size 4 at 0x891BC8C
==1219== at 0x508CD99: sm_fifo_read (btl_sm.h:273)
==1219== by 0x508C864: mca_btl_sm_component_progress (btl_sm_component.c:391)
==1219== by 0x41F72DF: opal_progress (opal_progress.c:207)
==1219== by 0x40BD67D: opal_condition_wait (condition.h:85)
==1219== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==1219== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==1219== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1219== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1219== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1219== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1219== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==1219== by 0x805EA43: kdgraphMapRbPart2 (kdgraph_map_rb_part.c:331)
==1219== Old state: shared-readonly by threads #1, #7
==1219== New state: shared-modified by threads #1, #7
==1219== Reason: this thread, #1, holds no consistent locks
==1219== Location 0x891BC8C has never been protected by any lock
==1220== Possible data race during write of size 4 at 0x4243A68
==1220== at 0x41F72A7: opal_progress (opal_progress.c:186)
==1220== by 0x40BD67D: opal_condition_wait (condition.h:85)
==1220== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==1220== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==1220== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1220== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1220== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1220== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1220== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==1220== by 0x805EA43: kdgraphMapRbPart2 (kdgraph_map_rb_part.c:331)
==1220== by 0x805EB86: _SCOTCHkdgraphMapRbPart (kdgraph_map_rb_part.c:421)
==1220== by 0x8057713: _SCOTCHkdgraphMapSt (kdgraph_map_st.c:182)
==1220== Old state: shared-readonly by threads #1, #7
==1220== New state: shared-modified by threads #1, #7
==1220== Reason: this thread, #1, holds no consistent locks
==1220== Location 0x4243A68 has never been protected by any lock
==8328== Possible data race during write of size 4 at 0x4532318
==8328== at 0x508A9B8: opal_atomic_lifo_pop (opal_atomic_lifo.h:111)
==8328== by 0x508A69F: mca_btl_sm_alloc (btl_sm.c:612)
==8328== by 0x5070571: mca_bml_base_alloc (bml.h:241)
==8328== by 0x5070778: mca_pml_ob1_send_request_start_copy (pml_ob1_sendreq.c:506)
==8328== by 0x5064C30: mca_pml_ob1_send_request_start_btl (pml_ob1_sendreq.h:363)
==8328== by 0x5064A19: mca_pml_ob1_send_request_start (pml_ob1_sendreq.h:429)
==8328== by 0x5064856: mca_pml_ob1_isend (pml_ob1_isend.c:87)
==8328== by 0x5142C46: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:51)
==8328== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8328== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8328== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8328== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8328== Old state: shared-readonly by threads #1, #8
==8328== New state: shared-modified by threads #1, #8
==8328== Reason: this thread, #1, holds no consistent locks
==8328== Location 0x4532318 has never been protected by any lock
==8329== Possible data race during write of size 4 at 0x452F238
==8329== at 0x5067FD3: recv_req_matched (pml_ob1_recvreq.h:219)
==8329== by 0x5067D95: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:191)
==8329== by 0x508C9BB: mca_btl_sm_component_progress (btl_sm_component.c:426)
==8329== by 0x41F72DF: opal_progress (opal_progress.c:207)
==8329== by 0x40BD67D: opal_condition_wait (condition.h:85)
==8329== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==8329== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==8329== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8329== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8329== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8329== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8329== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==8329== Old state: owned exclusively by thread #7
==8329== New state: shared-modified by threads #1, #7
==8329== Reason: this thread, #1, holds no locks at all
==8329== Possible data race during write of size 4 at 0x452F2DC
==8329== at 0x40D5946: ompi_convertor_unpack (convertor.c:280)
==8329== by 0x5067E78: mca_pml_ob1_recv_frag_callback_match (pml_ob1_recvfrag.c:215)
==8329== by 0x508C9BB: mca_btl_sm_component_progress (btl_sm_component.c:426)
==8329== by 0x41F72DF: opal_progress (opal_progress.c:207)
==8329== by 0x40BD67D: opal_condition_wait (condition.h:85)
==8329== by 0x40BDA96: ompi_request_default_wait_all (req_wait.c:262)
==8329== by 0x5142C78: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:55)
==8329== by 0x514F379: ompi_coll_tuned_barrier_intra_two_procs (coll_tuned_barrier.c:258)
==8329== by 0x5143252: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:192)
==8329== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==8329== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==8329== by 0x805E2B2: kdgraphMapRbPartFold2 (kdgraph_map_rb_part.c:199)
==8329== Old state: owned exclusively by thread #7
==8329== New state: shared-modified by threads #1, #7
==8329== Reason: this thread, #1, holds no locks at all
I guess the following is OK, but I provide it for
reference:
==1220== Possible data race during write of size 4 at 0x8968780
==1220== at 0x508A619: opal_atomic_unlock (atomic_impl.h:367)
==1220== by 0x508B468: mca_btl_sm_send (btl_sm.c:811)
==1220== by 0x5070A0C: mca_bml_base_send_status (bml.h:288)
==1220== by 0x50708E6: mca_pml_ob1_send_request_start_copy (pml_ob1_sendreq.c:567)
==1220== by 0x5064C30: mca_pml_ob1_send_request_start_btl (pml_ob1_sendreq.h:363)
==1220== by 0x5064A19: mca_pml_ob1_send_request_start (pml_ob1_sendreq.h:429)
==1220== by 0x5064856: mca_pml_ob1_isend (pml_ob1_isend.c:87)
==1220== by 0x5142C46: ompi_coll_tuned_sendrecv_actual (coll_tuned_util.c:51)
==1220== by 0x514F07A: ompi_coll_tuned_barrier_intra_recursivedoubling (coll_tuned_barrier.c:174)
==1220== by 0x51432A3: ompi_coll_tuned_barrier_intra_dec_fixed (coll_tuned_decision_fixed.c:208)
==1220== by 0x40E410C: PMPI_Barrier (pbarrier.c:59)
==1220== by 0x806C5FB: _SCOTCHdgraphInducePart (dgraph_induce.c:334)
==1220== Old state: shared-modified by threads #1, #7
==1220== New state: shared-modified by threads #1, #7
==1220== Reason: this thread, #1, holds no consistent locks
==1220== Location 0x8968780 has never been protected by any lock
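As an aside on how to read the reports above: as far as I understand, helgrind complains when a location is written by several threads without one common pthread lock being held on every access ("holds no consistent locks"). The sketch below is purely illustrative and is taken neither from Open MPI nor from PT-Scotch; the names shared_counter_t, locked_worker and run_locked_workers are made up for the example. It shows the access pattern helgrind can verify: every write to the shared counter happens with the same mutex held. Synchronization done through home-grown atomic operations may not be recognized by the tool, which is one possible reason why some of the reports could be harmless.

```c
/* Illustrative sketch only: not code from Open MPI or PT-Scotch. */
#include <pthread.h>
#include <stddef.h>

#define INCREMENTS_PER_THREAD 100000

/* Hypothetical shared state: a counter and the mutex that guards it. */
typedef struct {
    long            counter;
    pthread_mutex_t lock;
} shared_counter_t;

/* Thread body: every write to counter is done with lock held. */
static void *locked_worker(void *arg)
{
    shared_counter_t *sc = arg;

    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&sc->lock);
        sc->counter++;
        pthread_mutex_unlock(&sc->lock);
    }
    return NULL;
}

/* Runs two workers on one counter; returns its final value, or -1 on error. */
long run_locked_workers(void)
{
    shared_counter_t sc;
    pthread_t        threads[2];
    int              started = 0;

    sc.counter = 0;
    if (pthread_mutex_init(&sc.lock, NULL) != 0)
        return -1;

    for (int i = 0; i < 2; i++) {
        if (pthread_create(&threads[i], NULL, locked_worker, &sc) != 0)
            break;
        started++;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&sc.lock);
    return (started == 2) ? sc.counter : -1;
}
```

Calling run_locked_workers() from a small main() and running the binary under valgrind --tool=helgrind should not report a race on the counter. Removing the lock/unlock pair in locked_worker gives the kind of unprotected write that is reported as a "Possible data race".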
ompi_info says:
Package: Open MPI pelegrin_at_brol Distribution
Open MPI: 1.3.3a1r21206
Open MPI SVN revision: r21206
Open MPI release date: Unreleased developer copy
Open RTE: 1.3.3a1r21206
Open RTE SVN revision: r21206
Open RTE release date: Unreleased developer copy
OPAL: 1.3.3a1r21206
OPAL SVN revision: r21206
OPAL release date: Unreleased developer copy
Ident string: 1.3.3a1r21206
Prefix: /usr/local
Configured architecture: i686-pc-linux-gnu
Configure host: brol
Configured by: pelegrin
Configured on: Tue May 12 15:50:08 CEST 2009
Configure host: brol
Built by: pelegrin
Built on: Tue May 12 16:17:34 CEST 2009
Built host: brol
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
Fortran90 bindings size: small
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: gfortran
Fortran77 compiler abs: /usr/bin/gfortran
Fortran90 compiler: gfortran
Fortran90 compiler abs: /usr/bin/gfortran
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: no
Thread support: posix (mpi: yes, progress: no)
Sparse Groups: no
Internal debug support: yes
MPI parameter check: always
Memory profiling support: no
Memory debugging support: yes
libltdl support: yes
Heterogeneous support: no
mpirun default --prefix: no
MPI I/O support: yes
MPI_WTIME support: gettimeofday
Symbol visibility support: yes
FT Checkpoint support: no (checkpoint thread: no)
MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.3)
MCA memchecker: valgrind (MCA v2.0, API v2.0, Component v1.3.3)
MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.3)
MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.3)
MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.3)
MCA carto: file (MCA v2.0, API v2.0, Component v1.3.3)
MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.3)
MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.3)
MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.3)
MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.3)
MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.3)
MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.3)
MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.3)
MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: self (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.3)
MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.3)
MCA io: romio (MCA v2.0, API v2.0, Component v1.3.3)
MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.3)
MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.3)
MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: csum (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: v (MCA v2.0, API v2.0, Component v1.3.3)
MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.3)
MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: self (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.3)
MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.3)
MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.3)
MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.3)
MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.3)
MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.3)
MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.3)
MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.3)
MCA odls: default (MCA v2.0, API v2.0, Component v1.3.3)
MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3.3)
MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3.3)
MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3.3)
MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3.3)
MCA rml: oob (MCA v2.0, API v2.0, Component v1.3.3)
MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3.3)
MCA routed: direct (MCA v2.0, API v2.0, Component v1.3.3)
MCA routed: linear (MCA v2.0, API v2.0, Component v1.3.3)
MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3.3)
MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3.3)
MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3.3)
MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: env (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3.3)
MCA ess: tool (MCA v2.0, API v2.0, Component v1.3.3)
MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3.3)
MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.3)
Thanks in advance for any help / explanation,
f.p.