Open MPI User's Mailing List Archives

Subject: [OMPI users] Why might MPI_Recv trip PSM_MQ_RECVREQS_MAX ?
From: Jonathan Wesley Stone (stonejw_at_[hidden])
Date: 2010-03-07 16:17:33


Hi,

The supercomputer I use has Open MPI 1.4 installed, and I am running
into a frustrating problem with my MPI program. I use only the
following calls, which I expect to be blocking:
MPI_Wtime
MPI_Error_string
MPI_Abort
MPI_Send
MPI_Get_count
MPI_Recv
MPI_Probe
MPI_Init
MPI_Comm_rank
MPI_Comm_size
MPI_Finalize

Somehow I am getting this error when I do a large number of sequential
communications: "c002:2.0.Exhausted 1048576 MQ irecv request
descriptors, which usually indicates a user program error or
insufficient request descriptors (PSM_MQ_RECVREQS_MAX=1048576)"

This seems counterintuitive to me, because I should not be using any
irecvs at all: I am relying specifically on the documented blocking
behavior of MPI_Recv, and I never call MPI_Irecv.

My main program is quite large, but I have managed to reproduce the
behavior in the much smaller attached program, which executes a
number of MPI_Send or MPI_Recv calls in a loop. By default the program
runs 2,000,000 iterations. When I raise that to 20,000,000, it
generates the PSM_MQ_RECVREQS_MAX error after a short time.

I would appreciate it if anyone could explain why this happens in the
"test" case. Specifically, what causes my presumably blocking MPI_Recv
calls to accumulate such a large number of "irecv request
descriptors"? I expected each call to block, be resolved immediately
when the matching MPI_Send is posted, and bring the count back down.

I appreciate your assistance. Thank you!

Jonathan Stone
Research Assistant, U. Oklahoma



  • application/octet-stream attachment: crash.c