Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_Recv hangs
From: Jorge Chiva Segura (jordic_at_[hidden])
Date: 2012-05-04 09:20:20


Oops, I edited the code to make it easier to understand but forgot to
change two occurrences of p2, sorry.
I hope this version is completely right:

1: for(int p1=0; p1<np; ++p1) {
2:   for(int p2=0; p2<np; ++p2) {
3:     if(me==p1) {
4:       if(sendSize(p2)) MPI_Ssend(sendBuffer[p2],sendSize(p2),MPI_FLOAT,p2,0,myw); // processor p1 sends data to processor p2
5:       if(recvSize(p2)) MPI_Recv(recvBuffer[p2],recvSize(p2),MPI_FLOAT,p2,0,myw,&status); // processor p1 receives data from processor p2
6:     } else if(me==p2) {
7:       if(recvSize(p1)) MPI_Recv(recvBuffer[p1],recvSize(p1),MPI_FLOAT,p1,0,myw,&status); // processor p2 receives data from processor p1
8:       if(sendSize(p1)) MPI_Ssend(sendBuffer[p1],sendSize(p1),MPI_FLOAT,p1,0,myw); // processor p2 sends data to processor p1
9:     }
10:    MPI_Barrier(myw);
11:  }
12: }

This is the real code:

    for(int p1=0; p1<mpiS; ++p1) {
        for(int p2=0; p2<mpiS; ++p2) {
            if(mpiR==p1) {
                sento=p2;
                if(s.getMem(sento)) {
                    if(ite>25) cout<<"p1("<<p1<<") send "<<sento<<" "<<s.getMem(sento)<<" FLOATS "<<endl;
                    ok=MPI_Ssend(s.extractBuffer(sento),s.getMem(sento),MPI_FLOAT,sento,0,myw);
                    if(ok!=MPI_SUCCESS) cout<<"p1("<<p1<<") send "<<sento<<" "<<s.getMem(sento)<<" PROBLEMS "<<ok<<endl;
                }
                if(r.getMem(sento)) {
                    if(ite>25) cout<<"p1("<<p1<<") receive "<<sento<<" "<<r.getMem(sento)<<" FLOATS "<<endl;
                    ok=MPI_Recv(r.extractBuffer(sento),r.getMem(sento),MPI_FLOAT,sento,0,myw,&status);
                    if(ok!=MPI_SUCCESS) cout<<"p1("<<p1<<") receive "<<sento<<" "<<r.getMem(sento)<<" PROBLEMS "<<ok<<endl;
                }
            } else if(mpiR==p2) {
                sento=p1;
                if(r.getMem(sento)) {
                    if(ite>25) cout<<"p2("<<p2<<") receive "<<sento<<" "<<r.getMem(sento)<<" FLOATS "<<endl;
                    ok=MPI_Recv(r.extractBuffer(sento),r.getMem(sento),MPI_FLOAT,sento,0,myw,&status);
                    if(ok!=MPI_SUCCESS) cout<<"p2("<<p2<<") receive "<<sento<<" "<<r.getMem(sento)<<" PROBLEMS "<<ok<<endl;
                }
                if(s.getMem(sento)) {
                    if(ite>25) cout<<"p2("<<p2<<") send "<<sento<<" "<<s.getMem(sento)<<" FLOATS "<<endl;
                    ok=MPI_Ssend(s.extractBuffer(sento),s.getMem(sento),MPI_FLOAT,sento,0,myw);
                    if(ok!=MPI_SUCCESS) cout<<"p2("<<p2<<") send "<<sento<<" "<<s.getMem(sento)<<" PROBLEMS "<<ok<<endl;
                }
            }
            MPI_Barrier(myw);
        }
    }

Thanks Eduardo

On Fri, 2012-05-04 at 14:58 +0200, Eduardo Morras wrote:

> At 11:52 04/05/2012, you wrote:
> >Hi all,
> >
> >I have a program that executes a communication loop similar to this one:
> >
> >1: for(int p1=0; p1<np; ++p1) {
> >2: for(int p2=0; p2<np; ++p2) {
> >3: if(me==p1) {
> >4: if(sendSize(p2))
> >MPI_Ssend(sendBuffer[p2],sendSize(p2),MPI_FLOAT,p2,0,myw);
> >5: if(recvSize(p2))
> >MPI_Recv(recvBuffer[p2],recvSize(p2),MPI_FLOAT,p2,0,myw,&status);
> >6: } else if(yo==p2) {
> >7: if(recvSize(p1))
> >MPI_Recv(recvBuffer[p1],recvSize(p1),MPI_FLOAT,p2,0,myw,&status);
> >8: if(sendSize(p1))
> >MPI_Ssend(sendBuffer[p1],sendSize(p1),MPI_FLOAT,p2,0,myw);
> >9: }
> >10: MPI_Barrier(myw);
> >11: }
> >12: }
> >
> >The program is an iterative process that makes some calculations,
> >communicates and then continues with the next iteration. The problem
> >is that after making 30 successful iterations the program hangs.
> >With padb I have seen that one of the processors waits at line 5 for
> >the reception of data that was already sent and the rest of the
> >processors are waiting at the barrier in line 10. The size of the
> >messages and buffers is the same for all the iterations.
> >
> >My real program makes use of asynchronous communications for obvious
> >performance reasons and it worked without problems when the case to
> >solve was smaller (lower number of processors and memory), but I
> >found that for this case the program hung, and that is why I
> >changed the communication routine to use synchronous communications
> >to see where the problem is. Now I know where the program hangs, but
> >I don't understand what I am doing wrong.
> >
> >Any suggestions?
>
> All messages have p2 as destination. So p1 is waiting for a message
> that hasn't been sent to it. It shouldn't be waiting for any
> messages. I don't know the logic of your program, so I can't offer more
> suggestions or clues.
>
