Galen, George and others that might have SM BTL interest.
In my quest of looking at MPI_Iprobe performance I found what I think is
an issue. If an application using the SM BTL does a small message send
(<= 256 bytes) followed by an MPI_Iprobe, the mca_btl_sm_component
function that is eventually called as a result of opal_progress will
receive an ack message from its own send and then return. The net
effect is that the real message, which sits in the fifo behind the ack
message, doesn't get read until a second MPI_Iprobe is made.
It seems to me that mca_btl_sm_component should keep reading ack
messages from a particular fifo until it either finds a real send
fragment or the fifo is empty. Otherwise, we are forcing calls like
MPI_Iprobe to not return messages that are really there. I am not sure
about IB, but I know that the TCP BTL does not show this issue (which
doesn't surprise me, since I imagine that BTL is relying on TCP to
handle this type of protocol work).
Before I go munging with the code I wanted to make sure I am not
overlooking something here. One concern is that if I change the code to
drain all the ack messages, it might disrupt performance elsewhere.