
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Multi-threading with OpenMPI ?
From: Ashika Umanga Umagiliya (aumanga_at_[hidden])
Date: 2009-09-13 21:50:42


Greetings all,

After some reading, I found out that I have to build Open MPI with
"--enable-mpi-threads".
After that, I changed the MPI_Init() code in my "libParallel.so" and in
"parallel-svr" (please refer to http://i27.tinypic.com/mtqurp.jpg) to:

  int sup;
  MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &sup);

Now when multiple requests come in (multiple threads), MPI gives one of the
following two errors:

"<stddiag rank="0">[umanga:06127] [[8004,1],0] ORTE_ERROR_LOG: Data
unpack would read past end of buffer in file dpm_orte.c at line
299</stddiag>
[umanga:6127] *** An error occurred in MPI_Comm_spawn
[umanga:6127] *** on communicator MPI_COMM_SELF
[umanga:6127] *** MPI_ERR_UNKNOWN: unknown error
[umanga:6127] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[umanga:06126] [[8004,0],0]-[[8004,1],0] mca_oob_tcp_msg_recv: readv
failed: Connection reset by peer (104)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 6127 on
node umanga exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
"

or sometimes:

"[umanga:5477] *** An error occurred in MPI_Comm_spawn
[umanga:5477] *** on communicator MPI_COMM_SELF
[umanga:5477] *** MPI_ERR_UNKNOWN: unknown error
[umanga:5477] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
<stddiag rank="0">[umanga:05477] [[7630,1],0] ORTE_ERROR_LOG: Data
unpack would read past end of buffer in file dpm_orte.c at line
299</stddiag>
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 5477 on
node umanga exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------"

Any tips?

Thank you

Ashika Umanga Umagiliya wrote:
> Greetings all,
>
> Please refer to image at:
> http://i27.tinypic.com/mtqurp.jpg
>
> Here the process illustrated in the image:
>
> 1) The C++ webservice loads "libParallel.so" when it starts up (via dlopen).
> 2) When a new request comes from a client, a *new thread* is created,
> the SOAP data is bound to C++ objects, and the calcRisk() method of the
> webservice is invoked. Inside this method, "calcRisk()" of "libParallel"
> is invoked (using dlsym, etc.).
> 3) Inside "calcRisk()" of "libParallel", it spawns the "parallel-svr" MPI
> application.
> (I am using Boost.MPI and Boost.Serialization to send
> custom data types across the spawned processes.)
> 4) "parallel-svr" (the MPI application in the image) executes the
> parallel logic and sends the result back to "libParallel.so" using
> Boost.MPI send, etc.
> 5) "libParallel.so" sends the result to the webservice, which binds it
> into SOAP and sends it to the client, and the thread ends.
>
> My problem is :
>
> Everything works fine for the first request from the client,
> but for the second request it throws an error (I assume from
> "libParallel.so") saying:
>
> "--------------------------------------------------------------------------
>
> Calling any MPI-function after calling MPI_Finalize is erroneous.
> The only exceptions are MPI_Initialized, MPI_Finalized and
> MPI_Get_version.
> --------------------------------------------------------------------------
>
> *** An error occurred in MPI_Init
> *** after MPI was finalized
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [umanga:19390] Abort after MPI_FINALIZE completed successfully; not
> able to guarantee that all other processes were killed!"
>
>
> Is this because of multithreading? Any idea how to fix this?
>
> Thanks in advance,
> umanga
>