
Open MPI Development Mailing List Archives


From: George Bosilca (bosilca_at_[hidden])
Date: 2006-05-16 19:11:26


Commit 9946 solves the problem. I mixed up the return value of the
trylock call, treating any nonzero value as success when in fact 0
indicates success. Anyway, it's now fixed on the trunk.
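
For readers unfamiliar with the convention, here is a minimal sketch of the return-value semantics involved, assuming the underlying call is POSIX pthread_mutex_trylock(). This is not the code from commit 9946; the helper names try_acquire and check_trylock_semantics are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical helper (not Open MPI code): returns true only when the
 * lock was actually acquired. pthread_mutex_trylock() returns 0 on
 * success and a nonzero error number (EBUSY) if the mutex is already
 * locked. */
bool try_acquire(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);
    /* The mistake described above would be "return rc != 0;", which
     * reports success exactly when the lock was NOT obtained. */
    return rc == 0;
}

/* Hypothetical self-check: returns true if trylock behaves as described. */
bool check_trylock_semantics(void)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

    bool first = try_acquire(&m);   /* mutex is free: should succeed */
    bool second = try_acquire(&m);  /* mutex is held: should fail    */

    if (first) {
        int rc = pthread_mutex_unlock(&m);
        assert(rc == 0);
    }
    pthread_mutex_destroy(&m);
    return first && !second;
}
```

With the test inverted, a caller would believe it holds the lock when it does not, and would skip the critical section when it actually does hold it, so a later blocking lock on the same mutex from the same thread can look like a self-deadlock.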

   george.

On May 16, 2006, at 11:07 AM, Rolf Vandevaart wrote:

>
> Hi Brian:
>
> Here is the stack trace from the core dump. I am also trying to
> understand better what is happening here, but I figured I needed to
> get this off to you.
> Rolf
>
> burl-ct-v440-4 96 =>dbx connectivity core
> For information about new features see `help changes'
> To remove this message, put `dbxenv suppress_startup_message 7.4' in
> your .dbxrc
> Reading connectivity
> core file header read successfully
> [...snip...]
> (dbx) where
> current thread: t@1
> [1] _lwp_kill(0x0, 0x6, 0x0, 0x6, 0xfc00, 0x0), at 0xfd840f90
> [2] raise(0x6, 0x0, 0xfd824a98, 0xffffffff, 0xfd868284, 0x6), at 0xfd7dfd78
> [3] abort(0xffbfee00, 0x1, 0x0, 0xa83f0, 0xfd86b298, 0x0), at 0xfd7bff98
> =>[4] opal_mutex_lock(m = 0xfd0b12e8), line 101 in "mutex_unix.h"
> [5] __ompi_free_list_wait(fl = 0xfd0b1298, item = 0xffbfef88), line 167 in "ompi_free_list.h"
> [6] mca_pml_ob1_recv_frag_match(btl = 0xfcfbc778, hdr = 0xdc897260, segments = 0xdc897218, num_segments = 1U), line 550 in "pml_ob1_recvfrag.c"
> [7] mca_pml_ob1_recv_frag_callback(btl = 0xfcfbc778, tag = '\001', des = 0xdc8971d0, cbdata = (nil)), line 80 in "pml_ob1_recvfrag.c"
> [8] mca_btl_sm_component_progress(), line 396 in "btl_sm_component.c"
> [9] mca_bml_r2_progress(), line 103 in "bml_r2.c"
> [10] opal_progress(), line 288 in "opal_progress.c"
> [11] opal_condition_wait(c = 0xff29d3b8, m = 0xff29d430), line 75 in "condition.h"
> [12] mca_pml_ob1_recv(addr = 0xffbff4b0, count = 1U, datatype = 0x21458, src = 0, tag = 0, comm = 0x215a0, status = 0xffbff4c0), line 101 in "pml_ob1_irecv.c"
> [13] PMPI_Recv(buf = 0xffbff4b0, count = 1, type = 0x21458, source = 0, tag = 0, comm = 0x215a0, status = 0xffbff4c0), line 66 in "precv.c"
> [14] main(argc = 2, argv = 0xffbff53c), line 69 in "connectivity.c"
> (dbx)
>
>
>
> Brian Barrett wrote On 05/11/06 02:57,:
>
>> Eeeks! That sounds like a bug. Can you attach a debugger and get a
>> stack trace for the situation where that occurs?
>>
>> Brian
>>
>> On May 10, 2006, at 10:17 PM, Rolf Vandevaart wrote:
>>
>>
>>
>>> I have built a library with "--enable-mpi-threads --with-threads=posix"
>>> (using the trunk) and tried running a simple non-threaded program
>>> linked against it. The program just calls MPI_Send and MPI_Recv so
>>> that the processes send an MPI_INT to one another.
>>>
>>> When I run it I see the following:
>>>
>>> burl-ct-v440-4 86 =>mpirun -np 4 connectivity -v
>>> burl-ct-v440-4: checking connection 0 <-> 1
>>> burl-ct-v440-4: checking connection 1 <-> 2
>>> burl-ct-v440-4: checking connection 0 <-> 2
>>> opal_mutex_lock(): Deadlock situation detected/avoided
>>> Signal:6 info.si_errno:0(Error 0) si_code:-1()
>>> *** End of error message ***
>>> burl-ct-v440-4 87 =>
>>>
>>> Since I had debugging enabled, I get to see that one of the processes
>>> was trying to grab a lock that it already held. (Nice feature having
>>> that error printed out!)
>>>
>>> Has anyone else seen this? As I said, this is a non-threaded program
>>> so there is only one thread per process. I am wondering if I am
>>> missing something basic in the building of my library. This test
>>> works fine against a library configured without
>>> "--enable-mpi-threads --with-threads=posix".
>>>
>>> Rolf
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> =========================
>>> rolf.vandevaart_at_[hidden]
>>> 781-442-3043
>>> =========================
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>
>
> --
>
> =========================
> rolf.vandevaart_at_[hidden]
> 781-442-3043
> =========================
>

"Half of what I say is meaningless; but I say it so that the other
half may reach you"
                                   Kahlil Gibran