Open MPI User's Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2007-10-17 17:25:25


Can you send a short test program that shows this problem, perchance?

On Oct 3, 2007, at 1:41 PM, Daniel Rozenbaum wrote:

> Hi again,
>
> I'm trying to debug the problem I've posted about several times recently;
> I thought I'd try asking a more focused question:
>
> I have the following sequence in the client code:
>
>     MPI_Status stat;
>     ret = MPI_Probe(0, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
>     assert(ret == MPI_SUCCESS);
>     ret = MPI_Get_elements(&stat, MPI_BYTE, &count);
>     assert(ret == MPI_SUCCESS);
>     char *buffer = malloc(count);
>     assert(buffer != NULL);
>     ret = MPI_Recv((void *)buffer, count, MPI_BYTE, 0, stat.MPI_TAG,
>                    MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>     assert(ret == MPI_SUCCESS);
>     fprintf(stderr, "MPI_Recv done\n");
>     <proceed to taking action on the received buffer, send response to
>     server>
>
> Each MPI_ call in the lines above is surrounded by debug prints
> that print out the client's rank, current time, the action about to
> be taken with all its parameters' values, and the action's result.
> After the first cycle (receive message from server -- process it --
> send response -- wait for next message) works out as expected, the
> next cycle gets stuck in MPI_Recv. What I get in my debug prints is
> more or less the following:
>
>     MPI_Probe(source= 0, tag= MPI_ANY_TAG, comm= MPI_COMM_WORLD,
>               status= <address1>)
>     MPI_Probe done, source= 0, tag= 2, error= 0
>     MPI_Get_elements(status= <address1>, dtype= MPI_BYTE, count=
>                      <address2>)
>     MPI_Get_elements done, count= 2731776
>     MPI_Recv(buf= <address3>, count= 2731776, dtype= MPI_BYTE, src= 0,
>              tag= 2, comm= MPI_COMM_WORLD, stat= MPI_STATUS_IGNORE)
>     <nothing beyond this point. Some time afterwards there are "readv
>     failed" errors in the server's stderr>
>
> My question then is this - what would cause MPI_Recv to not return,
> after the immediately preceding MPI_Probe and MPI_Get_elements
> return properly?
>
> Thanks,
> Daniel

-- 
Jeff Squyres
Cisco Systems