Open MPI Development Mailing List Archives

Subject: [OMPI devel] opal_condition
From: Tim Prins (tprins_at_[hidden])
Date: 2007-12-05 15:10:24


Hi,

Last night one of our threaded builds on the trunk hung while running
make check, in the opal_condition test in test/threads/.

After running the test about 30-40 times, I was only able to get it to
hang once. Looking at it in gdb, we get:

(gdb) info threads
   3 Thread 1084229984 (LWP 8450) 0x0000002a95e3bba9 in sched_yield () from /lib64/tls/libc.so.6
   2 Thread 1094719840 (LWP 8451) 0xffffffffff600012 in ?? ()
   1 Thread 182904955328 (LWP 8430) 0x0000002a9567309b in pthread_join () from /lib64/tls/libpthread.so.0
(gdb) thread 2
[Switching to thread 2 (Thread 1094719840 (LWP 8451))]#0 0xffffffffff600012 in ?? ()
(gdb) bt
#0 0xffffffffff600012 in ?? ()
#1 0x0000000000000001 in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb) thread 1
[Switching to thread 1 (Thread 182904955328 (LWP 8430))]#0 0x0000002a9567309b in pthread_join () from /lib64/tls/libpthread.so.0
(gdb) bt
#0 0x0000002a9567309b in pthread_join () from /lib64/tls/libpthread.so.0
#1 0x0000002a95794a7d in opal_thread_join () from /san/homedirs/mpiteam/mtt-runs/odin/20071204-Nightly/pb_2/installs/Bp80/src/openmpi-1.3a1r16847/opal/.libs/libopen-pal.so.0
#2 0x0000000000401684 in main ()
(gdb) thread 3
[Switching to thread 3 (Thread 1084229984 (LWP 8450))]#0 0x0000002a95e3bba9 in sched_yield () from /lib64/tls/libc.so.6
(gdb) bt
#0 0x0000002a95e3bba9 in sched_yield () from /lib64/tls/libc.so.6
#1 0x0000000000401216 in thr1_run ()
#2 0x0000002a95672137 in start_thread () from /lib64/tls/libpthread.so.0
#3 0x0000002a95e53113 in clone () from /lib64/tls/libc.so.6
(gdb)

I know, this is not very helpful, but I have no idea what is going on.
There have been no changes in this code area for a long time.
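
Since the hang only shows up roughly once in 30-40 runs, rerunning the
test by hand is tedious. Below is a minimal sketch of a loop that reruns
a command with a timeout and reports which attempts hung. It is not part
of the Open MPI tree: the function name count_hangs, the default of 40
attempts, the 60-second timeout, and the ./opal_condition path in the
usage line are all assumptions made for illustration.

```python
import subprocess
import sys


def count_hangs(cmd, attempts=40, timeout_s=60.0):
    """Run cmd repeatedly; return the 1-based attempt numbers that timed out.

    A run that exceeds timeout_s is treated as a hang. subprocess.run()
    kills the child when the timeout expires, so the hung process is gone
    by the time this function returns.
    """
    hung = []
    for attempt in range(1, attempts + 1):
        try:
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            hung.append(attempt)
    return hung


# Hypothetical usage: python rerun.py ./opal_condition
if __name__ == "__main__" and len(sys.argv) > 1:
    print("hung attempts:", count_hangs(sys.argv[1:]))
```

Because the sketch kills the process on timeout, it only measures how
often the hang occurs. To inspect a hung run in gdb, the loop would need
to stop and leave the process alive instead of killing it.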

Has anyone else seen something like this? Any ideas what is going on?

Thanks,

Tim