Open MPI Development Mailing List Archives

Subject: [OMPI devel] MPI_Iprobe and mca_btl_sm_component_progress
From: Terry Dontje (Terry.Dontje_at_[hidden])
Date: 2008-06-19 08:16:11


Galen, George, and others who might have an interest in the SM BTL:

In my quest of looking at MPI_Iprobe performance, I found what I think is
an issue. If an application using the SM BTL does a small message send
(<= 256 bytes) followed by an MPI_Iprobe, the mca_btl_sm_component_progress
function that is eventually called as a result of opal_progress will
receive an ack message from its own send and then return. The net effect
is that the real message sitting behind the ack message doesn't get read
until a second MPI_Iprobe is made.
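
To make the pattern concrete, here is a minimal sketch of the
send-then-Iprobe sequence I mean. This is not a verified reproducer:
whether the ack lands in the fifo ahead of the peer's real message
depends on timing, and it assumes 2 ranks on one node so the SM BTL is
selected.

    /* Sketch: both ranks eagerly send a small (<= 256 byte) message to
     * each other, then count how many MPI_Iprobe calls it takes before
     * the peer's message becomes visible. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, peer, flag = 0, probes = 0;
        char buf[64] = "hi";               /* well under 256 bytes */
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        peer = 1 - rank;

        /* Small sends complete eagerly, so this cross-send is safe. */
        MPI_Send(buf, sizeof(buf), MPI_CHAR, peer, 42, MPI_COMM_WORLD);

        /* If progress returns after consuming only the ack for our own
         * send, the first MPI_Iprobe reports flag == 0 even though the
         * peer's message is already queued behind it. */
        do {
            MPI_Iprobe(peer, 42, MPI_COMM_WORLD, &flag, &status);
            probes++;
        } while (!flag);
        printf("rank %d: message visible after %d MPI_Iprobe call(s)\n",
               rank, probes);

        MPI_Recv(buf, sizeof(buf), MPI_CHAR, peer, 42, MPI_COMM_WORLD,
                 &status);
        MPI_Finalize();
        return 0;
    }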

It seems to me that mca_btl_sm_component_progress should read all ack
messages from a particular fifo until it either finds a real send
fragment or there are no more messages on the fifo. Otherwise, we are
forcing calls like MPI_Iprobe to not report messages that are really
there. I am not sure about IB, but I know that the TCP BTL does not show
this issue (which doesn't surprise me, since I imagine that BTL relies
on TCP to handle this type of protocol work).
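
In pseudocode, the drain loop I have in mind looks roughly like the
sketch below. All of the names here are hypothetical stand-ins; the real
code lives in btl_sm_component.c and uses the actual sm fifo API, which
differs from this.

    /* Hypothetical stand-ins for the internal sm fifo types/calls. */
    typedef struct frag frag_t;
    typedef struct fifo fifo_t;
    extern frag_t *fifo_read(fifo_t *fifo);
    extern int     frag_is_ack(const frag_t *frag);
    extern void    return_frag_to_sender(frag_t *frag);
    extern void    deliver_to_pml(frag_t *frag);

    /* Instead of returning after the first fragment, keep draining ack
     * fragments until a real send fragment or an empty fifo is found. */
    static int sm_progress_one_fifo_sketch(fifo_t *fifo)
    {
        frag_t *frag;

        while ((frag = fifo_read(fifo)) != NULL) {
            if (frag_is_ack(frag)) {
                return_frag_to_sender(frag); /* recycle the ack'd frag */
                continue;                    /* keep draining acks */
            }
            deliver_to_pml(frag);            /* real send fragment */
            return 1;                        /* made progress */
        }
        return 0;                            /* fifo is empty */
    }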

Before I go munging with the code I wanted to make sure I am not
overlooking something here. One concern: if I change the code to drain
all the ack messages, is that going to disrupt performance elsewhere?

--td