Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Beginner's question: why multiple sends or receives don't work?
From: David Zhang (solarbikedz_at_[hidden])
Date: 2011-02-22 11:39:55


I think Bill is right. Here is the description of MPI_Finalize:

This routine cleans up all MPI state. Once this routine is called, no MPI
routine (not even MPI_Init) may be called, except for MPI_Get_version,
MPI_Initialized, and MPI_Finalized. Unless there has been a call to
MPI_Abort, you must ensure that all pending communications involving a
process are complete before the process calls MPI_Finalize. If the call
returns, each process may either continue local computations or exit without
participating in further communication with other processes. *At the moment
when the last process calls MPI_Finalize, all pending sends must be matched
by a receive, and all pending receives must be matched by a send.*

So I believe what Bill is alluding to is that after you call the second
Isend, your receive side has not yet posted the second Irecv; thus when
MPI_Finalize is called on the send side, that message gets thrown away. When
your receive side does get to the second Irecv, it is waiting for a message
that will never arrive.
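
In code terms, that rule means completing every outstanding request (with
MPI_Wait or MPI_Waitall) before any rank reaches MPI_Finalize. A rough
sketch of the sending side, with names borrowed from your second example
rather than your exact code:

    /* sketch: complete both pending sends before calling MPI_Finalize */
    MPI_Request reqs[2];
    MPI_Status  stats[2];

    MPI_Isend(&mesg1, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&mesg2, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs[1]);

    MPI_Waitall(2, reqs, stats);   /* both send requests are complete here */
    MPI_Finalize();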

On Tue, Feb 22, 2011 at 8:06 AM, Bill Rankin <Bill.Rankin_at_[hidden]> wrote:

> Try putting an "MPI_Barrier()" call before your MPI_Finalize() [*]. I
> suspect that one of the programs (the sending side) is calling Finalize
> before the receiving side has processed the messages.
>
> -bill
>
> [*] pet peeve of mine: this should almost always be standard practice.
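>
> Something along these lines (just a sketch, not your exact code):
>
>     /* make sure every rank has arrived here before tearing MPI down */
>     MPI_Barrier(MPI_COMM_WORLD);
>     MPI_Finalize();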
>
>
> > -----Original Message-----
> > From: users-bounces_at_[hidden] [mailto:users-bounces_at_[hidden]] On
> > Behalf Of Xianglong Kong
> > Sent: Tuesday, February 22, 2011 10:27 AM
> > To: Open MPI Users
> > Subject: Re: [OMPI users] Beginner's question: why multiple sends or
> > receives don't work?
> >
> > Hi, thank you for the reply.
> >
> > However, using MPI_Waitall instead of MPI_Wait didn't solve the
> > problem; the code still hangs at the MPI_Waitall. Also, I don't quite
> > understand why the code is inherently unsafe. Can non-blocking sends
> > or receives cause a deadlock?
> >
> > Thanks!
> >
> > Kong
> >
> > On Mon, Feb 21, 2011 at 2:32 PM, Jeff Squyres <jsquyres_at_[hidden]>
> > wrote:
> > > It's because you're waiting on the receive request to complete before
> > > the send request. This likely works locally because the message
> > > transfer is through shared memory and is fast, but it's still an
> > > inherently unsafe way to block waiting for completion (i.e., the
> > > receive might not complete if the send does not complete).
> > >
> > > What you probably want to do is build an array of 2 requests and then
> > > issue a single MPI_Waitall() on both of them. This will allow MPI to
> > > progress both requests simultaneously.
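> > >
> > > Something like this, as a sketch (the request/status names are just
> > > illustrative, using rank 0's tags from your first example):
> > >
> > >     MPI_Request reqs[2];
> > >     MPI_Status  stats[2];
> > >
> > >     MPI_Irecv(&buf,  1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs[0]);
> > >     MPI_Isend(&mesg, 1, MPI_INT, dest,   tag2, MPI_COMM_WORLD, &reqs[1]);
> > >
> > >     /* wait on both requests at once so MPI can progress them together */
> > >     MPI_Waitall(2, reqs, stats);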
> > >
> > >
> > > On Feb 18, 2011, at 11:58 AM, Xianglong Kong wrote:
> > >
> > >> Hi, all,
> > >>
> > >> I'm an MPI newbie. I'm trying to connect two desktops in my office
> > >> to each other using a crossover cable and run a parallel code on
> > >> them using MPI.
> > >>
> > >> Now, the two nodes can ssh to each other without a password, and can
> > >> successfully run the MPI "Hello world" code. However, when I tried to
> > >> use multiple MPI non-blocking sends or receives, the job would hang.
> > >> The problem only shows up if the two processes are launched on
> > >> different nodes; the code runs successfully if both processes are
> > >> launched on the same node. It also runs successfully if there is only
> > >> one send and/or one receive in each process.
> > >>
> > >> Here is the code that can run successfully:
> > >>
> > >> #include <stdlib.h>
> > >> #include <stdio.h>
> > >> #include <string.h>
> > >> #include <mpi.h>
> > >>
> > >> int main(int argc, char** argv) {
> > >>
> > >> int myrank, nprocs;
> > >>
> > >> MPI_Init(&argc, &argv);
> > >> MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> > >> MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> > >>
> > >> printf("Hello from processor %d of %d\n", myrank, nprocs);
> > >>
> > >> MPI_Request reqs1, reqs2;
> > >> MPI_Status stats1, stats2;
> > >>
> > >> int tag1=10;
> > >> int tag2=11;
> > >>
> > >> int buf;
> > >> int mesg;
> > >> int source=1-myrank;
> > >> int dest=1-myrank;
> > >>
> > >> if(myrank==0)
> > >> {
> > >> mesg=1;
> > >>
> > >> MPI_Irecv(&buf, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> > >> MPI_Isend(&mesg, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> > >>
> > >>
> > >> }
> > >>
> > >> if(myrank==1)
> > >> {
> > >> mesg=2;
> > >>
> > >> MPI_Irecv(&buf, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs1);
> > >> MPI_Isend(&mesg, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs2);
> > >> }
> > >>
> > >> MPI_Wait(&reqs1, &stats1);
> > >> printf("myrank=%d,received the message\n",myrank);
> > >>
> > >> MPI_Wait(&reqs2, &stats2);
> > >> printf("myrank=%d,sent the messages\n",myrank);
> > >>
> > >> printf("myrank=%d, buf=%d\n",myrank, buf);
> > >>
> > >> MPI_Finalize();
> > >> return 0;
> > >> }
> > >>
> > >> And here is the code that hangs:
> > >>
> > >> #include <stdlib.h>
> > >> #include <stdio.h>
> > >> #include <string.h>
> > >> #include <mpi.h>
> > >>
> > >> int main(int argc, char** argv) {
> > >>
> > >> int myrank, nprocs;
> > >>
> > >> MPI_Init(&argc, &argv);
> > >> MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> > >> MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> > >>
> > >> printf("Hello from processor %d of %d\n", myrank, nprocs);
> > >>
> > >> MPI_Request reqs1, reqs2;
> > >> MPI_Status stats1, stats2;
> > >>
> > >> int tag1=10;
> > >> int tag2=11;
> > >>
> > >> int source=1-myrank;
> > >> int dest=1-myrank;
> > >>
> > >> if(myrank==0)
> > >> {
> > >> int buf1, buf2;
> > >>
> > >> MPI_Irecv(&buf1, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> > >> MPI_Irecv(&buf2, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs2);
> > >>
> > >> MPI_Wait(&reqs1, &stats1);
> > >> printf("received one message\n");
> > >>
> > >> MPI_Wait(&reqs2, &stats2);
> > >> printf("received two messages\n");
> > >>
> > >> printf("myrank=%d, buf1=%d, buf2=%d\n",myrank, buf1,
> > buf2);
> > >> }
> > >>
> > >> if(myrank==1)
> > >> {
> > >> int mesg1=1;
> > >> int mesg2=2;
> > >>
> > >> MPI_Isend(&mesg1, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs1);
> > >> MPI_Isend(&mesg2, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> > >>
> > >> MPI_Wait(&reqs1, &stats1);
> > >> printf("sent one message\n");
> > >>
> > >> MPI_Wait(&reqs2, &stats2);
> > >> printf("sent two messages\n");
> > >> }
> > >>
> > >> MPI_Finalize();
> > >> return 0;
> > >> }
> > >>
> > >> And here is the output of the second, failing code:
> > >> ***********************************************
> > >> Hello from processor 0 of 2
> > >>
> > >> Received one message
> > >>
> > >> Hello from processor 1 of 2
> > >>
> > >> Sent one message
> > >> *******************************************************
> > >>
> > >> Can anyone help point out why the second code doesn't work?
> > >>
> > >> Thanks!
> > >>
> > >> Kong
> > >>
> > >
> > >
> > > --
> > > Jeff Squyres
> > > jsquyres_at_[hidden]
> > > For corporate legal information go to:
> > > http://www.cisco.com/web/about/doing_business/legal/cri/
> > >
> > >
> > >
> >
> >
> >
> > --
> > Xianglong Kong
> > Department of Mechanical Engineering
> > University of Rochester
> > Phone: (585)520-4412
> > MSN: dinosaur8312_at_[hidden]
> >
>
>
>
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

-- 
David Zhang
University of California, San Diego