I think Bill is right. Here is the description of MPI_Finalize:
This routine cleans up all MPI states. Once this routine is called, no
MPI routine (not even MPI_Init) may be called, except for
MPI_Get_version, MPI_Initialized, and MPI_Finalized. Unless there has
been a call to MPI_Abort, you must ensure that all pending
communications involving a process are complete before the process
calls MPI_Finalize. If the call returns, each process may either
continue local computations or exit without participating in further
communication with other processes. At the moment when the last process
calls MPI_Finalize, all pending sends must be matched by a receive, and
all pending receives must be matched by a send.
So I believe what Bill is alluding to is this: when the sender issues its second Isend, the receive side may not yet have posted the second Irecv. If the send side then calls MPI_Finalize, the message can get thrown out, and when the receive side does get to the second Irecv, it waits for a message that will never arrive.
Try putting an "MPI_Barrier()" call before your MPI_Finalize() [*]. I suspect that one of the programs (the sending side) is calling Finalize before the receiving side has processed the messages.
-bill
[*] Pet peeve of mine: this should almost always be standard practice.
> -----Original Message-----
> From: users-bounces@open-mpi.org [mailto:users-bounces@open-mpi.org] On Behalf Of Xianglong Kong
> Sent: Tuesday, February 22, 2011 10:27 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Beginner's question: why multiple sends or receives don't work?
>
> Hi, Thank you for the reply.
>
> However, using MPI_Waitall instead of MPI_Wait didn't solve the
> problem. The code still hangs at the MPI_Waitall. Also, I don't quite
> understand why the code is inherently unsafe. Can a non-blocking
> send or receive cause a deadlock?
>
> Thanks!
>
> Kong
>
> On Mon, Feb 21, 2011 at 2:32 PM, Jeff Squyres <jsquyres@cisco.com> wrote:
> > It's because you're waiting on the receive request to complete before
> > the send request. This likely works locally because the message
> > transfer is through shared memory and is fast, but it's still an
> > inherently unsafe way to block waiting for completion (i.e., the
> > receive might not complete if the send does not complete).
> >
> > What you probably want to do is build an array of 2 requests and then
> > issue a single MPI_Waitall() on both of them. This will allow MPI to
> > progress both requests simultaneously.
> >
> >
> > On Feb 18, 2011, at 11:58 AM, Xianglong Kong wrote:
> >
> >> Hi, all,
> >>
> >> I'm an MPI newbie. I'm trying to connect two desktops in my office
> >> to each other with a crossover cable and run a parallel code
> >> on them using MPI.
> >>
> >> Now, the two nodes can ssh to each other without a password and can
> >> successfully run the MPI "Hello world" code. However, when I tried to
> >> use multiple MPI non-blocking sends or receives, the job would hang.
> >> The problem only shows up if the two processes are launched on
> >> different nodes; the code runs successfully if the two processes
> >> are launched on the same node. The code also runs successfully if
> >> there is only one send and/or one receive in each process.
> >>
> >> Here is the code that can run successfully:
> >>
> >> #include <stdlib.h>
> >> #include <stdio.h>
> >> #include <string.h>
> >> #include <mpi.h>
> >>
> >> int main(int argc, char** argv) {
> >>
> >>     int myrank, nprocs;
> >>
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> >>
> >>     printf("Hello from processor %d of %d\n", myrank, nprocs);
> >>
> >>     MPI_Request reqs1, reqs2;
> >>     MPI_Status stats1, stats2;
> >>
> >>     int tag1=10;
> >>     int tag2=11;
> >>
> >>     int buf;
> >>     int mesg;
> >>     int source=1-myrank;
> >>     int dest=1-myrank;
> >>
> >>     if(myrank==0)
> >>     {
> >>         mesg=1;
> >>
> >>         MPI_Irecv(&buf, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> >>     }
> >>
> >>     if(myrank==1)
> >>     {
> >>         mesg=2;
> >>
> >>         MPI_Irecv(&buf, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs2);
> >>     }
> >>
> >>     MPI_Wait(&reqs1, &stats1);
> >>     printf("myrank=%d,received the message\n",myrank);
> >>
> >>     MPI_Wait(&reqs2, &stats2);
> >>     printf("myrank=%d,sent the messages\n",myrank);
> >>
> >>     printf("myrank=%d, buf=%d\n",myrank, buf);
> >>
> >>     MPI_Finalize();
> >>     return 0;
> >> }
> >>
> >> And here is the code that will hang
> >>
> >> #include <stdlib.h>
> >> #include <stdio.h>
> >> #include <string.h>
> >> #include <mpi.h>
> >>
> >> int main(int argc, char** argv) {
> >>
> >>     int myrank, nprocs;
> >>
> >>     MPI_Init(&argc, &argv);
> >>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> >>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> >>
> >>     printf("Hello from processor %d of %d\n", myrank, nprocs);
> >>
> >>     MPI_Request reqs1, reqs2;
> >>     MPI_Status stats1, stats2;
> >>
> >>     int tag1=10;
> >>     int tag2=11;
> >>
> >>     int source=1-myrank;
> >>     int dest=1-myrank;
> >>
> >>     if(myrank==0)
> >>     {
> >>         int buf1, buf2;
> >>
> >>         MPI_Irecv(&buf1, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Irecv(&buf2, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs2);
> >>
> >>         MPI_Wait(&reqs1, &stats1);
> >>         printf("received one message\n");
> >>
> >>         MPI_Wait(&reqs2, &stats2);
> >>         printf("received two messages\n");
> >>
> >>         printf("myrank=%d, buf1=%d, buf2=%d\n",myrank, buf1, buf2);
> >>     }
> >>
> >>     if(myrank==1)
> >>     {
> >>         int mesg1=1;
> >>         int mesg2=2;
> >>
> >>         MPI_Isend(&mesg1, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs1);
> >>         MPI_Isend(&mesg2, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> >>
> >>         MPI_Wait(&reqs1, &stats1);
> >>         printf("sent one message\n");
> >>
> >>         MPI_Wait(&reqs2, &stats2);
> >>         printf("sent two messages\n");
> >>     }
> >>
> >>     MPI_Finalize();
> >>     return 0;
> >> }
> >>
> >> And the output of the second failed code:
> >> ***********************************************
> >> Hello from processor 0 of 2
> >>
> >> Received one message
> >>
> >> Hello from processor 1 of 2
> >>
> >> Sent one message
> >> *******************************************************
> >>
> >> Can anyone help to point out why the second code didn't work?
> >>
> >> Thanks!
> >>
> >> Kong
> >>
> >> _______________________________________________
> >> users mailing list
> >> users@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> > --
> > Jeff Squyres
> > jsquyres@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >
> >
>
>
>
> --
> Xianglong Kong
> Department of Mechanical Engineering
> University of Rochester
> Phone: (585)520-4412
> MSN: dinosaur8312@hotmail.com
>