Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Beginner's question: why multiple sends or receives don't work?
From: Xianglong Kong (dinosaur8312_at_[hidden])
Date: 2011-02-22 10:26:35


Hi, Thank you for the reply.

However, using MPI_Waitall instead of MPI_Wait didn't solve the
problem. The code still hangs at the MPI_Waitall call. Also, I don't
quite understand why the code is inherently unsafe. Can a non-blocking
send or receive cause a deadlock?

Thanks!

Kong

On Mon, Feb 21, 2011 at 2:32 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> It's because you're waiting on the receive request to complete before the send request.  This likely works locally because the message transfer is through shared memory and is fast, but it's still an inherently unsafe way to block waiting for completion (i.e., the receive might not complete if the send does not complete).
>
> What you probably want to do is build an array of 2 requests and then issue a single MPI_Waitall() on both of them.  This will allow MPI to progress both requests simultaneously.
>
>
> On Feb 18, 2011, at 11:58 AM, Xianglong Kong wrote:
>
>> Hi, all,
>>
>> I’m an MPI newbie. I’m trying to connect two desktops in my office
>> to each other with a crossover cable and run a parallel code
>> on them using MPI.
>>
>> Now, the two nodes can ssh to each other without a password and can
>> successfully run the MPI “Hello world” code. However, when I try to
>> use multiple non-blocking MPI sends or receives, the job hangs.
>> The problem only shows up if the two processes are launched on
>> different nodes; the code runs successfully if the two processes
>> are launched on the same node. The code also runs successfully if
>> there is only one send and/or one receive in each process.
>>
>> Here is the code that can run successfully:
>>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <mpi.h>
>>
>> int main(int argc, char** argv) {
>>
>>       int myrank, nprocs;
>>
>>       MPI_Init(&argc, &argv);
>>       MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>>       MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>>
>>       printf("Hello from processor %d of %d\n", myrank, nprocs);
>>
>>       MPI_Request reqs1, reqs2;
>>       MPI_Status stats1, stats2;
>>
>>       int tag1=10;
>>       int tag2=11;
>>
>>       int buf;
>>       int mesg;
>>       int source=1-myrank;
>>       int dest=1-myrank;
>>
>>       if(myrank==0)
>>       {
>>               mesg=1;
>>
>>               MPI_Irecv(&buf, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
>>               MPI_Isend(&mesg, 1, MPI_INT, dest,  tag2, MPI_COMM_WORLD, &reqs2);
>>
>>
>>       }
>>
>>       if(myrank==1)
>>       {
>>               mesg=2;
>>
>>               MPI_Irecv(&buf, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs1);
>>               MPI_Isend(&mesg, 1, MPI_INT,  dest, tag1, MPI_COMM_WORLD, &reqs2);
>>       }
>>
>>       MPI_Wait(&reqs1, &stats1);
>>       printf("myrank=%d,received the message\n",myrank);
>>
>>       MPI_Wait(&reqs2, &stats2);
>>       printf("myrank=%d,sent the messages\n",myrank);
>>
>>       printf("myrank=%d, buf=%d\n",myrank, buf);
>>
>>       MPI_Finalize();
>>       return 0;
>> }
>>
>> And here is the code that hangs:
>>
>> #include <stdlib.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <mpi.h>
>>
>> int main(int argc, char** argv) {
>>
>>       int myrank, nprocs;
>>
>>       MPI_Init(&argc, &argv);
>>       MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>>       MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>>
>>       printf("Hello from processor %d of %d\n", myrank, nprocs);
>>
>>       MPI_Request reqs1, reqs2;
>>       MPI_Status stats1, stats2;
>>
>>       int tag1=10;
>>       int tag2=11;
>>
>>       int source=1-myrank;
>>       int dest=1-myrank;
>>
>>       if(myrank==0)
>>       {
>>               int buf1, buf2;
>>
>>               MPI_Irecv(&buf1, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
>>               MPI_Irecv(&buf2, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs2);
>>
>>               MPI_Wait(&reqs1, &stats1);
>>               printf("received one message\n");
>>
>>               MPI_Wait(&reqs2, &stats2);
>>               printf("received two messages\n");
>>
>>               printf("myrank=%d, buf1=%d, buf2=%d\n",myrank, buf1, buf2);
>>       }
>>
>>       if(myrank==1)
>>       {
>>               int mesg1=1;
>>               int mesg2=2;
>>
>>               MPI_Isend(&mesg1, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs1);
>>               MPI_Isend(&mesg2, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
>>
>>               MPI_Wait(&reqs1, &stats1);
>>               printf("sent one message\n");
>>
>>               MPI_Wait(&reqs2, &stats2);
>>               printf("sent two messages\n");
>>       }
>>
>>       MPI_Finalize();
>>       return 0;
>> }
>>
>> And here is the output of the second (failing) code:
>> ***********************************************
>> Hello from processor 0 of 2
>>
>> Received one message
>>
>> Hello from processor 1 of 2
>>
>> Sent one message
>> *******************************************************
>>
>> Can anyone help to point out why the second code didn't work?
>>
>> Thanks!
>>
>> Kong
>>
>> _______________________________________________
>> users mailing list
>> users_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>

-- 
Xianglong Kong
Department of Mechanical Engineering
University of Rochester
Phone: (585)520-4412
MSN: dinosaur8312_at_[hidden]