Open MPI User's Mailing List Archives

From: hpetit_at_[hidden]
Date: 2007-04-04 12:50:38


Hi,

I have a problem with Open MPI 1.2.0rc hanging in a "pthread_cond_wait" call.
This happens with any application when Open MPI has been compiled with multi-thread support.

The full "configure" command line is:
"./configure --prefix=/usr/local/Mpi/openmpi-1.2 --enable-mpi-threads
--enable-progress-threads --with-threads=posix --enable-smp-lock"

An example GDB session is shown below:

-------------------------------------------------------------------------------------------------------------
>GNU gdb 6.3-debian
>Copyright 2004 Free Software Foundation, Inc.
>GDB is free software, covered by the GNU General Public License, and
>you are welcome to change it and/or distribute copies of it under certain
>conditions.
>Type "show copying" to see the conditions.
>There is absolutely no warranty for GDB. Type "show warranty" for
>details.
>This GDB was configured as "i386-linux"...Using host libthread_db
>library "/lib/tls/libthread_db.so.1".
>
>(gdb) run -np 1 spawn6
>Starting program: /usr/local/openmpi-1.2.0/bin/mpirun -np 1 spawn6
>[Thread debugging using libthread_db enabled]
>[New Thread 1076191360 (LWP 29006)]
>[New Thread 1084808112 (LWP 29009)]
>main*******************************
>main : Lancement MPI*
>
>Program received signal SIGINT, Interrupt.
>[Switching to Thread 1084808112 (LWP 29009)]
>0x401f0523 in poll () from /lib/tls/libc.so.6
>(gdb) where
>#0 0x401f0523 in poll () from /lib/tls/libc.so.6
>#1 0x40081c7c in opal_poll_dispatch () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#2 0x4007e4f1 in opal_event_base_loop () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#3 0x4007e36b in opal_event_loop () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#4 0x4007f423 in opal_event_run () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#5 0x40115b63 in start_thread () from /lib/tls/libpthread.so.0
>#6 0x401f918a in clone () from /lib/tls/libc.so.6
>(gdb) bt
>#0 0x401f0523 in poll () from /lib/tls/libc.so.6
>#1 0x40081c7c in opal_poll_dispatch () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#2 0x4007e4f1 in opal_event_base_loop () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#3 0x4007e36b in opal_event_loop () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#4 0x4007f423 in opal_event_run () from
>/usr/local/openmpi-1.2.0/lib/libopen-pal.so.0
>#5 0x40115b63 in start_thread () from /lib/tls/libpthread.so.0
>#6 0x401f918a in clone () from /lib/tls/libc.so.6
>(gdb) info threads
>* 2 Thread 1084808112 (LWP 29009) 0x401f0523 in poll () from
>/lib/tls/libc.so.6
> 1 Thread 1076191360 (LWP 29006) 0x40118295 in
>pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
>(gdb) thread 1
>[Switching to thread 1 (Thread 1076191360 (LWP 29006))]#0 0x40118295
>in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/tls/libpthread.so.0
>(gdb) bt
>#0 0x40118295 in pthread_cond_wait@@GLIBC_2.3.2 () from
>/lib/tls/libpthread.so.0
>#1 0x0804cb68 in opal_condition_wait (c=0x8050e4c, m=0x8050e28) at
>condition.h:64
>#2 0x0804a4fe in orterun (argc=4, argv=0xbffff844) at orterun.c:436
>#3 0x0804a046 in main (argc=4, argv=0xbffff844) at main.c:13
>(gdb) where
>#0 0x40118295 in pthread_cond_wait@@GLIBC_2.3.2 () from
>/lib/tls/libpthread.so.0
>#1 0x0804cb68 in opal_condition_wait (c=0x8050e4c, m=0x8050e28) at
>condition.h:64
>#2 0x0804a4fe in orterun (argc=4, argv=0xbffff844) at orterun.c:436
>#3 0x0804a046 in main (argc=4, argv=0xbffff844) at main.c:13

-------------------------------------------------------------------------------------------------------------
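To make the pattern in the two backtraces easier to discuss: thread 1 is blocked in a condition wait while thread 2 sits in the event loop, so the wait never returns if nothing signals the condition. The sketch below is only an analogy in Python, not Open MPI code. The names `Completion`, `run` and `progress_signals` are hypothetical, and the timeout is added purely so that the stuck case returns instead of hanging.

```python
import threading


class Completion:
    """Hypothetical stand-in for a 'done' flag guarded by a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False

    def signal(self):
        # Set the flag and wake every waiter while holding the lock.
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def wait(self, timeout):
        # wait_for re-checks the flag after each wakeup and returns its
        # final value, so it yields False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(lambda: self._done, timeout=timeout)


def run(progress_signals, timeout=0.2):
    """Start a 'progress' thread, then wait for it like the main thread does."""
    completion = Completion()

    def progress_loop():
        # Stands in for the event-loop thread; it signals only when asked to.
        if progress_signals:
            completion.signal()

    worker = threading.Thread(target=progress_loop, daemon=True)
    worker.start()
    finished = completion.wait(timeout)
    worker.join()
    return finished


healthy = run(progress_signals=True)   # the waiter is woken up
stuck = run(progress_signals=False)    # nobody signals: the wait times out

print("healthy run completed:", healthy)
print("stuck run completed:", stuck)
```

Without the timeout, the second call would block forever, which is what the backtrace of thread 1 looks like from the outside.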

I have read the other threads about multi-thread support, and I understand that it will not be a priority before the end of the year.

However, this locking problem has only appeared since the Open MPI 1.1.2 release. Since it is a locking problem, I was wondering whether you could make an exception and try to analyse it before the end of the year.

Thanks,

Herve

P.S.: My OS is Debian sarge.
