I am a new user of Open MPI; I have used MPICH before.
I asked on the users list, but nobody there could help me.

There is a performance bug in the following scenario:

proc_B: MPI_Isend(..., proc_A, ..., &request);

proc_A: MPI_Recv(...,proc_B);

For a message size of 8 MB, proc_B calls MPI_Test 88 times, which means the point-to-point communication takes 88 seconds.
By the way, bandwidth isn't the problem (the interconnect is InfiniBand).
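
The polling pattern on proc_B can be sketched roughly as below. This is only an illustration, not the actual application code: overlap_loop, poll_fn, work_fn, fake_request, poll_fake and count_slice are hypothetical names, and the request is simulated so that the snippet compiles without MPI. In a real run the poll callback would wrap MPI_Test, as shown in the comment.

```c
/* Hypothetical callback types (not part of MPI). */
typedef int  (*poll_fn)(void *state);   /* returns nonzero once the transfer has completed */
typedef void (*work_fn)(void *state);   /* performs one slice of computation */

/*
 * Poll for completion, doing one slice of work after every unsuccessful poll.
 * Returns the number of polls made, or -1 if max_polls was reached first.
 */
int overlap_loop(poll_fn poll, void *poll_state,
                 work_fn work, void *work_state, int max_polls)
{
    for (int polls = 1; polls <= max_polls; polls++) {
        if (poll(poll_state))
            return polls;      /* transfer finished on this poll */
        work(work_state);      /* overlap: compute while the message is in flight */
    }
    return -1;
}

/*
 * With MPI, the poll callback would wrap MPI_Test (this needs <mpi.h>,
 * so it is left as a comment here):
 *
 *   static int poll_mpi(void *state) {
 *       int done = 0;
 *       MPI_Test((MPI_Request *)state, &done, MPI_STATUS_IGNORE);
 *       return done;
 *   }
 */

/* Stand-in for a request: reports completion on poll number polls_needed. */
struct fake_request { int polls_needed; int polls_seen; };

int poll_fake(void *state)
{
    struct fake_request *r = state;
    r->polls_seen++;
    return r->polls_seen >= r->polls_needed;
}

/* Stand-in for one slice of computation: it only counts the slices. */
void count_slice(void *state)
{
    (*(int *)state)++;
}
```

The value returned by overlap_loop corresponds to the number of MPI_Test calls reported above. Each unsuccessful poll is followed by exactly one slice of work, so the length of a slice determines how often the library gets a chance to make progress.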

Obviously, there is a problem with the progress of asynchronous messages. To overlap communication and computation, I don't want to use MPI_Wait. The message is probably being split into chunks, and the chunk size is probably defined by an environment variable.
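
Taking the chunking guess at face value, the numbers above imply a chunk size that could be compared against whatever limit is configured. The sketch below rests on two unconfirmed assumptions: that "8MB" means 8 * 1024 * 1024 bytes, and that each MPI_Test call advances exactly one chunk. The function names are hypothetical.

```c
/* Calls needed if each call advances exactly one chunk (an assumption). */
long calls_needed(long message_bytes, long chunk_bytes)
{
    return (message_bytes + chunk_bytes - 1) / chunk_bytes;   /* ceiling division */
}

/* Smallest chunk size for which `calls` calls move the whole message. */
long implied_chunk_bytes(long message_bytes, long calls)
{
    return (message_bytes + calls - 1) / calls;               /* ceiling division */
}
```

Under those assumptions, implied_chunk_bytes(8L * 1024 * 1024, 88) gives 95326 bytes, roughly 93 KiB per call. A 1 MiB chunk would need only 8 calls.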

How can I advance the message more aggressively, or can I control the chunk size?
Thank you very much.