I am still seeing a problem in which two processes send to the same receiving process and one of the senders hangs for 150 to 550 ms inside the call to MPI_Send.
Each process runs on a different node, and the receiving process has posted an MPI_Irecv 17 ms before the hanging send.
The posted receives use 172K buffers, and the sending processes send 81K messages.
I have set mpi_leave_pinned to 1 and increased btl_openib_receive_queues to ...:S,65536,512,256,64.
How do I trace the various phases of message passing to diagnose where the send is hanging up?