Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] init_thread + spawn error
From: Tim Prins (tprins_at_[hidden])
Date: 2008-04-04 12:06:06


Thanks for the report. As Ralph indicated, the threading support in Open
MPI is not good right now, but we are working to make it better.

I have filed a ticket (https://svn.open-mpi.org/trac/ompi/ticket/1267)
so we do not lose track of this issue, and attached a potential fix to
the ticket.
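
For context, the "Resource deadlock avoided" text in the trace below is the
standard strerror() string for EDEADLK on glibc, which is what
pthread_mutex_lock() returns when a thread tries to re-lock an error-checking
mutex it already holds. The following is only a minimal sketch of that
behaviour in plain pthreads; it is not Open MPI's code, the function name
relock_errorcheck_mutex is made up for illustration, and it assumes (without
confirmation from the trace) that the mutex involved is of the
PTHREAD_MUTEX_ERRORCHECK type.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK under strict -std= modes */
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock an error-checking mutex twice from the same
 * thread and return the error code of the second attempt. */
int relock_errorcheck_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc;

    /* Request an error-checking mutex instead of the default type. */
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);       /* first lock succeeds */
    rc = pthread_mutex_lock(&mutex);  /* second lock by the owner fails */

    /* The mutex is still held exactly once, so one unlock releases it. */
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return rc;
}
```

The helper returns EDEADLK instead of hanging; passing that value to
strerror() on glibc gives the same message that appears in the report.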

Thanks,

Tim

Joao Vicente Lima wrote:
> Hi,
> I am getting an error when calling init_thread and comm_spawn with this code:
>
> #include "mpi.h"
> #include <stdio.h>
>
> int
> main (int argc, char *argv[])
> {
>   int provided;
>   MPI_Comm parentcomm, intercomm;
>
>   MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>   MPI_Comm_get_parent (&parentcomm);
>
>   if (parentcomm == MPI_COMM_NULL)
>     {
>       printf ("spawning ... \n");
>       MPI_Comm_spawn ("./spawn1", MPI_ARGV_NULL, 1,
>                       MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm,
>                       MPI_ERRCODES_IGNORE);
>       MPI_Comm_disconnect (&intercomm);
>     }
>   else
>     {
>       printf ("child!\n");
>       MPI_Comm_disconnect (&parentcomm);
>     }
>
>   MPI_Finalize ();
>   return 0;
> }
>
> and the error is:
>
> spawning ...
> opal_mutex_lock(): Resource deadlock avoided
> [localhost:18718] *** Process received signal ***
> [localhost:18718] Signal: Aborted (6)
> [localhost:18718] Signal code: (-6)
> [localhost:18718] [ 0] /lib/libpthread.so.0 [0x2b6e5d9fced0]
> [localhost:18718] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2b6e5dc3b3c5]
> [localhost:18718] [ 2] /lib/libc.so.6(abort+0x10e) [0x2b6e5dc3c73e]
> [localhost:18718] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b6e5c9560ff]
> [localhost:18718] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b6e5c95601d]
> [localhost:18718] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b6e5c9560ac]
> [localhost:18718] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b6e5c956a93]
> [localhost:18718] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b6e5c9569dd]
> [localhost:18718] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b6e5c95797d]
> [localhost:18718] [ 9]
> /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x1ec)
> [0x2b6e5c957dd9]
> [localhost:18718] [10]
> /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b6e607f05cf]
> [localhost:18718] [11]
> /usr/local/mpi/ompi-svn/lib/libmpi.so.0(MPI_Comm_spawn+0x459)
> [0x2b6e5c98ede9]
> [localhost:18718] [12] ./spawn1(main+0x7a) [0x400ae2]
> [localhost:18718] [13] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b6e5dc28b74]
> [localhost:18718] [14] ./spawn1 [0x4009d9]
> [localhost:18718] *** End of error message ***
> opal_mutex_lock(): Resource deadlock avoided
> [localhost:18719] *** Process received signal ***
> [localhost:18719] Signal: Aborted (6)
> [localhost:18719] Signal code: (-6)
> [localhost:18719] [ 0] /lib/libpthread.so.0 [0x2b9317a17ed0]
> [localhost:18719] [ 1] /lib/libc.so.6(gsignal+0x35) [0x2b9317c563c5]
> [localhost:18719] [ 2] /lib/libc.so.6(abort+0x10e) [0x2b9317c5773e]
> [localhost:18719] [ 3] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b93169710ff]
> [localhost:18719] [ 4] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b931697101d]
> [localhost:18719] [ 5] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b93169710ac]
> [localhost:18719] [ 6] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b9316971a93]
> [localhost:18719] [ 7] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b93169719dd]
> [localhost:18719] [ 8] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b931697297d]
> [localhost:18719] [ 9]
> /usr/local/mpi/ompi-svn/lib/libmpi.so.0(ompi_proc_unpack+0x1ec)
> [0x2b9316972dd9]
> [localhost:18719] [10]
> /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b931a80b5cf]
> [localhost:18719] [11]
> /usr/local/mpi/ompi-svn/lib/openmpi/mca_dpm_orte.so [0x2b931a80dad7]
> [localhost:18719] [12] /usr/local/mpi/ompi-svn/lib/libmpi.so.0 [0x2b9316977207]
> [localhost:18719] [13]
> /usr/local/mpi/ompi-svn/lib/libmpi.so.0(PMPI_Init_thread+0x166)
> [0x2b93169b8622]
> [localhost:18719] [14] ./spawn1(main+0x25) [0x400a8d]
> [localhost:18719] [15] /lib/libc.so.6(__libc_start_main+0xf4) [0x2b9317c43b74]
> [localhost:18719] [16] ./spawn1 [0x4009d9]
> [localhost:18719] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 18719 on node localhost
> exited on signal 6 (Aborted).
> --------------------------------------------------------------------------
>
> If I change MPI_Init_thread to MPI_Init, everything works.
> Any suggestions?
> The attachments contain my ompi_info (r18077) and config.log.
>
> thanks in advance,
> Joao.