Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI process hangs if OpenMPI is compiled with --enable-thread-multiple
From: Pierre Jolivet (jolivet_at_[hidden])
Date: 2013-11-23 21:22:32


Dominique,
It looks like you are compiling Open MPI with Homebrew. The flags the formula passes when --enable-mpi-thread-multiple is requested are wrong; notice that both of your backtraces end inside ompi_mpi_init(), which points at the build rather than at your test program.
Cf. a similar problem with MacPorts: https://lists.macosforge.org/pipermail/macports-tickets/2013-June/138145.html
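
Until the formula is fixed, building from source with the correct option should work. Something like this, untested, and with a hypothetical install prefix that you should adjust:

$ ./configure --prefix=$HOME/openmpi --enable-mpi-thread-multiple
$ make install

Then rebuild and rerun testmpi against that installation.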

Pierre
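
P.S. In Ralph's output below, provided = 3 is just the value Open MPI happens to assign to MPI_THREAD_MULTIPLE; the standard only guarantees the ordering MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED < MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE, so it is safer to compare against the named constant. A minimal, untested variant of his test:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    int provided = -1;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    /* provided may be lower than requested; check it by name, not by number */
    if (provided < MPI_THREAD_MULTIPLE)
        fprintf(stderr, "MPI_THREAD_MULTIPLE unavailable, provided = %d\n", provided);
    MPI_Finalize();
    return 0;
}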

On Nov 23, 2013, at 4:56 PM, Ralph Castain <rhc_at_[hidden]> wrote:

> Hmmm...well, it seems to work for me:
>
> $ mpirun -n 4 ./thread_init
> Calling MPI_Init_thread...
> Calling MPI_Init_thread...
> Calling MPI_Init_thread...
> Calling MPI_Init_thread...
> MPI_Init_thread returned, provided = 3
> MPI_Init_thread returned, provided = 3
> MPI_Init_thread returned, provided = 3
> MPI_Init_thread returned, provided = 3
> $
>
> This is with the current 1.7 code branch, so it's possible something has been updated. You might try it with the next nightly tarball and see if it helps.
>
> BTW: The correct configure option is --enable-mpi-thread-multiple
>
> My test program:
>
> #include <mpi.h>
> #include <stdio.h>
>
> int main(int argc, const char* argv[]) {
>     int provided = -1;
>     printf("Calling MPI_Init_thread...\n");
>     MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &provided);
>     printf("MPI_Init_thread returned, provided = %d\n", provided);
>     MPI_Finalize();
>     return 0;
> }
>
>
> On Nov 21, 2013, at 1:36 PM, Dominique Orban <dominique.orban_at_[hidden]> wrote:
>
>> Hi,
>>
>> I'm compiling the example code at the bottom of the following page that illustrates MPI_Init_thread():
>>
>> http://mpi.deino.net/mpi_functions/mpi_init_thread.html
>>
>> I have Open MPI 1.7.3 installed on OS X 10.8.5, configured with --enable-thread-multiple and compiled with clang-425.0.28. I can reproduce the following on OS X 10.9 (clang-500), and another user was able to reproduce it on some flavor of Linux:
>>
>> $ mpicc -g -o testmpi testmpi.c -lmpi
>> $ mpirun -n 2 ./testmpi
>> $ # hangs forever
>>
>> I don't know how to debug MPI programs, but it was suggested that I do this:
>>
>> $ mpirun -n 2 xterm -e gdb ./testmpi
>>
>> In the first xterm, I type 'run' in gdb, interrupt the program after a while, and get a backtrace:
>>
>> ^C
>> Program received signal SIGINT, Interrupt.
>> 0x00007fff99116322 in select$DARWIN_EXTSN ()
>> from /usr/lib/system/libsystem_kernel.dylib
>> (gdb) where
>> #0 0x00007fff99116322 in select$DARWIN_EXTSN ()
>> from /usr/lib/system/libsystem_kernel.dylib
>> #1 0x00000001001963c2 in select_dispatch ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libopen-pal.6.dylib
>> #2 0x000000010018f178 in opal_libevent2021_event_base_loop ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libopen-pal.6.dylib
>> #3 0x000000010015f059 in opal_progress ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libopen-pal.6.dylib
>> #4 0x0000000100019321 in ompi_mpi_init () from /usr/local/lib/libmpi.1.dylib
>> #5 0x00000001000334da in MPI_Init_thread () from /usr/local/lib/libmpi.1.dylib
>> #6 0x0000000100000ddb in main (argc=1, argv=0x7fff5fbfedc0) at testmpi.c:9
>> (gdb) backtrace
>> #0 0x00007fff99116322 in select$DARWIN_EXTSN ()
>> from /usr/lib/system/libsystem_kernel.dylib
>> #1 0x00000001001963c2 in select_dispatch ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libopen-pal.6.dylib
>> #2 0x000000010018f178 in opal_libevent2021_event_base_loop ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libopen-pal.6.dylib
>> #3 0x000000010015f059 in opal_progress ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libopen-pal.6.dylib
>> #4 0x0000000100019321 in ompi_mpi_init () from /usr/local/lib/libmpi.1.dylib
>> #5 0x00000001000334da in MPI_Init_thread () from /usr/local/lib/libmpi.1.dylib
>> #6 0x0000000100000ddb in main (argc=1, argv=0x7fff5fbfedc0) at testmpi.c:9
>> (gdb)
>>
>> In the second xterm window:
>>
>> ^C
>> Program received signal SIGINT, Interrupt.
>> 0x00000001002e9a28 in mca_common_sm_init ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libmca_common_sm.4.dylib
>> (gdb) where
>> #0 0x00000001002e9a28 in mca_common_sm_init ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/libmca_common_sm.4.dylib
>> #1 0x00000001002e5a38 in mca_mpool_sm_init ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_mpool_sm.so
>> #2 0x00000001000793fa in mca_mpool_base_module_create ()
>> from /usr/local/lib/libmpi.1.dylib
>> #3 0x000000010053fb41 in mca_btl_sm_add_procs ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_btl_sm.so
>> #4 0x0000000100535dfb in mca_bml_r2_add_procs ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_bml_r2.so
>> #5 0x000000010051e59c in mca_pml_ob1_add_procs ()
>> from /usr/local/Cellar/open-mpi/1.7.3/lib/openmpi/mca_pml_ob1.so
>> #6 0x000000010001959b in ompi_mpi_init () from /usr/local/lib/libmpi.1.dylib
>> #7 0x00000001000334da in MPI_Init_thread () from /usr/local/lib/libmpi.1.dylib
>> #8 0x0000000100000ddb in main (argc=1, argv=0x7fff5fbfedc0) at testmpi.c:9
>> (gdb)
>>
>>
>> The output of `ompi_info --parsable` is here: https://gist.github.com/7590040
>>
>> Thanks in advance.
>>
>> Dominique