Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Recv hangs
From: Jorge Chiva Segura (jordic_at_[hidden])
Date: 2012-05-04 11:41:39


On Fri, 2012-05-04 at 16:44 +0200, Eduardo Morras wrote:
> At 15:20 04/05/2012, you wrote:
> >Oops, I edited the code to make it easier to understand but I forgot
> >to change two occurrences of p2, sorry ^^' .
> >I hope this one is completely right:
> >
> >1: for(int p1=0; p1<np; ++p1) {
> >2: for(int p2=0; p2<np; ++p2) {
> >3: if(me==p1) {
> >4: if(sendSize(p2))
> >MPI_Ssend(sendBuffer[p2],sendSize(p2),MPI_FLOAT,p2,0,myw);
> >//processor p1 sends data to processor p2
> >5: if(recvSize(p2))
> >MPI_Recv(recvBuffer[p2],recvSize(p2),MPI_FLOAT,p2,0,myw,&status);
> >//processor p1 receives data from processor p2
> >6: } else if(me==p2) {
> >7: if(recvSize(p1))
> >MPI_Recv(recvBuffer[p1],recvSize(p1),MPI_FLOAT,p1,0,myw,&status);
> >//processor p2 receives data from processor p1
> >8: if(sendSize(p1))
> >MPI_Ssend(sendBuffer[p1],sendSize(p1),MPI_FLOAT,p1,0,myw);
> >//processor p2 sends data to processor p1
> >9: }
> >10: MPI_Barrier(myw);
> >11: }
> >12: }
>
>
> Now p1 will send messages to p2 and receive messages from p2
>
> Now p2 will send messages to p1 and receive messages from p1
>
> The logic of send/recv looks ok. Now, in 5 and 7, what value do
> recvSize(p2) and recvSize(p1) return?
All the sendSizes and recvSizes are constant between iterations and are
calculated once during setup, before any of the calculations start.

The function recvSize(p) returns the size of the message that I should
receive from processor p, so in the case of line 5 recvSize(p2) returns
the size of the message that I (me,p1) should receive from processor p2
and at line 7 recvSize(p1) returns the size of the message that I
(me,p2) should receive from processor p1. Something similar happens with
sendSize(p). In a hypothetical smaller case involving 3 processors the
values could be:
From processor 0:
sendSize(0) ==> 0
sendSize(1) ==> 11
sendSize(2) ==> 22
recvSize(0) ==> 0
recvSize(1) ==> 33
recvSize(2) ==> 44
From processor 1:
sendSize(0) ==> 33
sendSize(1) ==> 0
sendSize(2) ==> 55
recvSize(0) ==> 11
recvSize(1) ==> 0
recvSize(2) ==> 66
From processor 2:
sendSize(0) ==> 44
sendSize(1) ==> 66
sendSize(2) ==> 0
recvSize(0) ==> 22
recvSize(1) ==> 55
recvSize(2) ==> 0

The main thing here is that from processor p1 the sendSize(p2) should
match the recvSize(p1) from processor p2.
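
As a minimal sketch of that matching rule, the check below uses the
3-processor values listed above. It is plain C++ with no MPI calls, and
the names SizeTable, exampleSendSizes, exampleRecvSizes and
countMismatches are hypothetical helpers made up for this illustration,
not part of the real program. table[r][p] holds the value that
processor r reports for peer p.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical layout: table[r][p] is the value rank r reports for peer p.
using SizeTable = std::vector<std::vector<int>>;

// sendSize values from the 3-processor example in this message.
SizeTable exampleSendSizes() {
    return {{0, 11, 22}, {33, 0, 55}, {44, 66, 0}};
}

// recvSize values from the 3-processor example in this message.
SizeTable exampleRecvSizes() {
    return {{0, 33, 44}, {11, 0, 66}, {22, 55, 0}};
}

// Counts the (sender, receiver) pairs for which the sender's sendSize
// differs from the size the receiver expects from that sender.
int countMismatches(const SizeTable& sendSize, const SizeTable& recvSize) {
    assert(sendSize.size() == recvSize.size());
    const int np = static_cast<int>(sendSize.size());
    int mismatches = 0;
    for (int p1 = 0; p1 < np; ++p1) {
        for (int p2 = 0; p2 < np; ++p2) {
            // What p1 sends to p2 must equal what p2 expects from p1.
            if (sendSize[p1][p2] != recvSize[p2][p1]) {
                std::printf("mismatch: %d sends %d to %d, which expects %d\n",
                            p1, sendSize[p1][p2], p2, recvSize[p2][p1]);
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Calling countMismatches(exampleSendSizes(), exampleRecvSizes()) on the
values above should report no mismatches.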
> The size of the buffer
> received from the MPI_Recv done in previous for loop?
There is no relation between the iterations. In each iteration only two
processors (p1 and p2) exchange data (first p1 sends data to p2, then p2
sends data to p1), and the rest of the processors wait in the Barrier.
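
To make the schedule concrete, here is a small simulation of the
simplified loop (lines 1 to 12 above) that only counts which sends would
be issued. It is a sketch without any MPI calls; countSends and the
sendSize[r][p] table layout are hypothetical names used for illustration
only. If I read the loop correctly, with the 3-processor example values
every ordered pair of distinct processors sends twice: once in iteration
(p1=a, p2=b) through line 4 and once in iteration (p1=b, p2=a) through
line 8.

```cpp
#include <cassert>
#include <vector>

// Simulates the simplified double loop for every rank "me" and counts how
// often each ordered pair (sender, receiver) would issue a send.
// sendSize[r][p] is the hypothetical size that rank r sends to peer p.
std::vector<std::vector<int>> countSends(
        const std::vector<std::vector<int>>& sendSize) {
    const int np = static_cast<int>(sendSize.size());
    std::vector<std::vector<int>> sends(np, std::vector<int>(np, 0));
    for (int p1 = 0; p1 < np; ++p1) {
        for (int p2 = 0; p2 < np; ++p2) {
            for (int me = 0; me < np; ++me) {
                if (me == p1) {
                    // Line 4: p1 sends to p2 when there is data to send.
                    if (sendSize[me][p2]) ++sends[me][p2];
                } else if (me == p2) {
                    // Line 8: p2 sends to p1 when there is data to send.
                    if (sendSize[me][p1]) ++sends[me][p1];
                }
                // All other ranks only reach the barrier in this iteration.
            }
        }
    }
    return sends;
}
```

With the example table, countSends gives 2 for every pair of different
processors and 0 on the diagonal, because sendSize is 0 for a processor
sending to itself.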

>
>
> >This is the real code:
> >
> >for(int p1=0; p1<mpiS; ++p1) {
> >for(int p2=0; p2<mpiS; ++p2) {
> >if(mpiR==p1) {
> >sento=p2;
> >if(s.getMem(sento)){
> >if(ite>25) cout<<"p1("<<p1<<") enviar "<<sento<<"
> >"<<s.getMem(sento)<<" FLOATS "<<endl;
> >ok=MPI_Ssend(s.extractBuffer(sento),s.getMem(sento),MPI_FLOAT,sento,0,myw);
>
> I don't know what you are doing here; the second parameter,
> s.getMem(sento), should be the size of the buffer.
And it is the size of the buffer.

>
> MPI_Ssend is defined for c++ :
>
> void Comm::Ssend(const void* buf, int count, const Datatype&
> datatype, int dest, int tag) const
>
> and you are using the C call. Are you mixing C and C++ code? Be
> careful with that.

> The rest of your code has the same problems, check them. Perhaps you
> need a tutorial, check
> http://www.mpitutorial.com/beginner-mpi-tutorial/ , it's for mpich
> but is mpi-flavourless, so it works with openmpi too.
Thanks for the tutorial, I will check it and it is good that it is for
mpich because now I'm testing the program with it and other versions of
mpi, but I'm not really sure that this is a beginner problem.

I found a way to get my program running without hangs:
http://www.open-mpi.org/community/lists/users/2012/05/19188.php
http://www.open-mpi.org/community/lists/users/2012/05/19185.php

Do you know what could cause the program to hang with the default value
(310) but work fine with 305? I also tested it with 311 but it hung,
so it seems that activating the SEND flag is not enough.

> >Thanks Eduardo
>
> HTH and happy coding

Thanks for your help
