Open MPI User's Mailing List Archives

Subject: [OMPI users] Why might MPI_Recv trip PSM_MQ_RECVREQS_MAX ?
From: Jonathan Wesley Stone (stonejw_at_[hidden])
Date: 2010-03-07 16:17:33


Hi,

My supercomputer has Open MPI 1.4. I am running into a frustrating
problem with my MPI program. I am using only the following calls,
which I expect to be blocking:
MPI_Wtime
MPI_Error_string
MPI_Abort
MPI_Send
MPI_Get_count
MPI_Recv
MPI_Probe
MPI_Init
MPI_Comm_rank
MPI_Comm_size
MPI_Finalize

Somehow I am getting this error when I do a large number of sequential
communications: "c002:2.0.Exhausted 1048576 MQ irecv request
descriptors, which usually indicates a user program error or
insufficient request descriptors (PSM_MQ_RECVREQS_MAX=1048576)"

This seems counter-intuitive to me, because I should not be generating
irecvs at all: I am relying specifically on the documented blocking
behavior of MPI_Recv (not MPI_Irecv, which I never call).

My main program is quite large; however, I have managed to replicate
the irritating behavior in the much smaller attached program, which
executes a number of MPI_Send or MPI_Recv calls in a loop. By default
it runs 2,000,000 iterations. When I turn that up to 20,000,000, it
generates the PSM_MQ_RECVREQS_MAX error after a short time.
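
In outline, the test does something like the following. This is only a
simplified sketch reconstructed from the description above, not the
exact code in the attached crash.c; the variable names and the
optional command-line iteration count are illustrative placeholders.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    /* Sketch only: rank 0 sends a small message to rank 1 in a tight
       loop and rank 1 receives it with blocking MPI_Recv.  The default
       of 2,000,000 iterations matches the description above; the
       optional argv[1] override is a placeholder. */
    long iters = (argc > 1) ? atol(argv[1]) : 2000000L;
    long i;
    int rank, payload = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < iters; i++) {
        if (rank == 0)
            MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     &status);
    }

    MPI_Finalize();
    return 0;
}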

I would appreciate it if anyone could advise why this happens in the
test case: what causes my presumably blocking MPI_Recv calls to
"accumulate" such a large number of "irecv request descriptors"? I
would expect each receive to block, be resolved as soon as the
matching MPI_Send is posted, and release its descriptor so that the
count goes back down.

I appreciate your assistance. Thank you!

Jonathan Stone
Research Assistant, U. Oklahoma



  • application/octet-stream attachment: crash.c