On Mar 18, 2008, at 10:32 AM, George Bosilca wrote:
> Jeff hinted at the real problem in his email. Even if the program uses
> the correct MPI functions, it is not 100% correct.
I think we disagree here -- the sample program is correct according to
the MPI spec. It's an implementation artifact that makes it deadlock.
The upcoming v1.3 series doesn't suffer from this issue; we revamped
our transport system to distinguish between early and normal
completions. The pml_ob1_use_early_completion MCA param was added to
v1.2.6 to allow correct MPI apps to avoid this optimization -- a
proper fix is coming in the v1.3 series.
> It might pass in some situations, but can lead to fake "deadlocks"
> in others. The problem comes from the flow control. If the messages
> are small (which is the case in the test example), Open MPI will
> send them eagerly. Without flow control, these messages will be
> buffered by the receiver, which will exhaust the memory on the
> receiver. Once this happens, some of the messages may get dropped,
> but the most visible result is that progress will happen very
> (VERY) slowly.
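
To make the buffering argument concrete, here is a toy model in plain C.
It is only a sketch and is not Open MPI code: the function name
peak_buffered and its parameters are made up for illustration, and the
"credits" scheme is just one generic way to do flow control, not a
description of what any btl actually does. Each tick the sender injects
up to send_rate small messages and the receiver consumes up to recv_rate;
the function reports the largest backlog the receiver ever has to hold.

```c
#include <stddef.h>

/* Toy model (hypothetical, not Open MPI internals).
 * credits <= 0 : no flow control, every eager send is accepted.
 * credits >  0 : at most `credits` unconsumed messages may be in flight.
 * Returns the peak number of messages buffered at the receiver,
 * or -1 if a rate is not positive. */
static long peak_buffered(long total_msgs, long send_rate,
                          long recv_rate, long credits)
{
    long sent = 0, consumed = 0, peak = 0;

    if (send_rate <= 0 || recv_rate <= 0)
        return -1;

    while (consumed < total_msgs) {
        /* Sender phase: inject as many messages as the rate allows. */
        long want = total_msgs - sent;
        if (want > send_rate)
            want = send_rate;

        /* With flow control, never exceed the credit window. */
        if (credits > 0) {
            long room = credits - (sent - consumed);
            if (want > room)
                want = room;
        }
        sent += want;

        /* Track the backlog sitting in the receiver's buffers. */
        long buffered = sent - consumed;
        if (buffered > peak)
            peak = buffered;

        /* Receiver phase: drain up to recv_rate messages. */
        consumed += (buffered < recv_rate) ? buffered : recv_rate;
    }
    return peak;
}
```

With 1000 messages, a sender ten times faster than the receiver, and no
flow control, peak_buffered(1000, 10, 1, 0) returns 901: nearly every
message ends up buffered on the receiver. With a window of 16 credits,
peak_buffered(1000, 10, 1, 16) returns 16, so the backlog stays bounded
no matter how many messages are sent.
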
Your text implies that we can actually *drop* (and retransmit)
messages in the sm btl. That doesn't sound right to me -- is that
what you meant?