Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem with MPI_Request, MPI_Isend/recv and MPI_Wait/Test
From: George Bosilca (bosilca_at_[hidden])
Date: 2011-05-19 12:48:03


David,

I do not see any mechanism that restricts access to the requests to a single thread. What thread model are you using?

From an implementation perspective, your code is correct only if you initialize the MPI library with MPI_THREAD_MULTIPLE and the library grants that level. Otherwise, the library either assumes that the application is single-threaded, or the MPI behavior is implementation dependent. Please read the section of the MPI standard on MPI_Init_thread for more details.
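
If MPI_THREAD_MULTIPLE is not granted, one possible way to restrict access to a request to one thread at a time is a mutex around every call that touches it. The sketch below is only an illustration under assumptions: it does not call MPI at all, so that it compiles without an MPI installation. fake_request and fake_test are hypothetical stand-ins for MPI_Request and MPI_Test, and shared_state, worker and run_demo are names made up for this example. With a real library this pattern would still need at least MPI_THREAD_SERIALIZED to be granted by MPI_Init_thread.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for an MPI_Request (NOT the real type). */
typedef struct {
    int  polls_left;  /* polls remaining until the fake operation completes */
    bool active;      /* false once completed, similar to MPI_REQUEST_NULL */
} fake_request;

typedef struct {
    pthread_mutex_t lock;          /* serializes every access to req */
    fake_request    req;
    bool            done;          /* set once the completion branch has run */
    int             handler_calls; /* how many times the completion branch ran */
} shared_state;

/* Hypothetical stand-in for MPI_Test. The caller must hold the lock. */
static bool fake_test(fake_request *r)
{
    if (!r->active)
        return true;               /* inactive request: reports completion */
    if (--r->polls_left <= 0) {
        r->active = false;         /* mark the request as released */
        return true;
    }
    return false;
}

static void *worker(void *arg)
{
    shared_state *s = arg;
    for (;;) {
        /* Only the thread holding the lock reads or writes req and done. */
        if (pthread_mutex_trylock(&s->lock) == 0) {
            bool finished = s->done;
            if (!finished && fake_test(&s->req)) {
                s->handler_calls++;    /* plays the role of Call_Function_A */
                s->done = true;
                finished = true;
            }
            pthread_mutex_unlock(&s->lock);
            if (finished)
                break;
        }
        sched_yield();                 /* Call_Function_B would run here */
    }
    return NULL;
}

/* Starts nthreads workers and returns how often the completion branch ran. */
int run_demo(int nthreads)
{
    enum { MAX_THREADS = 16 };
    pthread_t tid[MAX_THREADS];
    shared_state s = { .req = { .polls_left = 1000, .active = true } };

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    pthread_mutex_init(&s.lock, NULL);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &s);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&s.lock);
    return s.handler_calls;
}
```

Calling run_demo(4) starts four threads and returns 1, since the completion branch runs exactly once no matter which thread gets there first. Unlike the quoted pseudo-code, the done flag is also read only while the lock is held.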

Regards,
  george.

On May 19, 2011, at 02:34, David Büttner wrote:

> Hello,
>
> I am working on a hybrid MPI (OpenMPI 1.4.3) and Pthread code. I am using MPI_Isend and MPI_Irecv for communication and MPI_Test/MPI_Wait to check if it is done. I do this repeatedly in the outer loop of my code. The MPI_Test is used in the inner loop to check if some function can be called which depends on the received data.
> The program regularly crashed (only when not using printf...) and after debugging it I figured out the following problem:
>
> In MPI_Isend I have an invalid read of memory. I fixed the problem by no longer re-using
>
> MPI_Request req_s, req_r;
>
> and instead using
>
> MPI_Request* req_s;
> MPI_Request* req_r;
>
> and re-allocating them before the MPI_Isend/recv.
>
> The documentation says that MPI_Wait and MPI_Test (if successful) deallocate the request objects and set them to MPI_REQUEST_NULL.
> It also says that MPI_Isend and MPI_Irecv allocate the objects and associate them with the request object.
>
> As I understand this, it means either that I can use a pointer to MPI_Request which I don't have to initialize (this doesn't work; it crashes), or that I can use an MPI_Request pointer which I have initialized with malloc(sizeof(MPI_Request)) (or the address of an MPI_Request req), which is set and unset in the functions. But this version crashes, too.
> What works is using a pointer which I allocate before the MPI_Isend/recv and free after MPI_Wait in every iteration. In other words: it only works if I don't reuse any kind of MPI_Request, only if I recreate one every time.
>
> Is this what it should be like? I believe that reusing the memory would be a lot more efficient (fewer calls to malloc...). Am I missing something here? Or am I doing something wrong?
>
>
> Let me provide some more detailed information about my problem:
>
> I am running the program on a 30-node InfiniBand cluster. Each node has 4 single-core Opteron CPUs. I am running 1 MPI rank per node and 4 threads per rank (-> one thread per core).
> I am compiling with mpicc of OpenMPI using gcc below.
> Some pseudo-code of the program can be found at the end of this e-mail.
>
> I was able to reproduce the problem using different numbers of nodes and even using one node only. The problem does not arise when I put printf-debugging information into the code. This suggested that I have a memory problem, where some write accesses memory it is not supposed to.
> I ran the tests using valgrind with --leak-check=full and --show-reachable=yes, which pointed me either to MPI_Isend or MPI_Wait depending on whether I had the threads spin in a loop for MPI_Test to return success or used MPI_Wait respectively.
>
> I would appreciate your help with this. Am I missing something important here? Is there a way to re-use the request in the different iterations other than the way I thought it should work?
> Or is there a way to re-initialize the allocated memory before the MPI_Isend/recv so that I at least don't have to call free and malloc each time?
>
> Thank you very much for your help!
> Kind regards,
> David Büttner
>
> _____________________
> Pseudo-Code of program:
>
> MPI_Request* req_s;
> MPI_Request* req_r;
> OUTER-LOOP
> {
>     if(0 == threadid)
>     {
>         req_s = malloc(sizeof(MPI_Request));
>         req_r = malloc(sizeof(MPI_Request));
>         MPI_Isend(..., req_s)
>         MPI_Irecv(..., req_r)
>     }
>     pthread_barrier_wait
>     INNER-LOOP (while NOT_DONE or RET)
>     {
>         if(TRYLOCK && NOT_DONE)
>         {
>             if(MPI_Test(req_r))
>             {
>                 Call_Function_A;
>                 NOT_DONE = 0;
>             }
>         }
>         RET = Call_Function_B;
>     }
>     pthread_barrier_wait
>     if(0 == threadid)
>     {
>         MPI_Wait(req_s)
>         MPI_Wait(req_r)
>         free(req_s);
>         free(req_r);
>     }
> }
> _____________
>
>
> --
> David Büttner, Informatik, Technische Universität München
> TUM I-10 - FMI 01.06.059 - Tel. 089 / 289-17676
>

"To preserve the freedom of the human mind then and freedom of the press, every spirit should be ready to devote itself to martyrdom; for as long as we may think as we will, and speak as we think, the condition of man will proceed in improvement."
  -- Thomas Jefferson, 1799