> > Not to muddy the point, but if there's enough ambiguity in the Standard
> > for people to ignore the progress rule, then I think (hope) there's enough
> > ambiguity for people to ignore the sender throttling issue too ;)
>
> I understand your position, and I used to agree until I was forced to
> change my mind by naive users :-)
Right. That's what I meant by:
"Most of the vendors aren't allowed to have this perspective....".
>
> Poorly written MPI codes won't likely segfault or deadlock because the
> progress rule was ignored. However, users will proudly tell you that you
> have a memory leak if you don't limit the size of the unexpected queue
> and their codes with no flow control blow up.
Yep. I don't lose money when I tell these people to go fix their code. I like
to think that I actually get paid to tell these people to go fix their code....
>
> You don't have to make it very efficient (per-sender credits
> definitely does not scale), but you need to have a way to stall/slow
> the sender when the unexpected queue gets too big. That's quite easy to
> do without affecting the common case.
Not on my network. I don't have the nice situation that the Standard refers
to where one producer is overwhelming the consumer. For a reasonable number
of endpoints and a known offending sender, it's pretty straightforward to
do a user-level credit-based flow control.
I'm looking at a network where the number of endpoints is large enough that
everybody can't have a credit to start with, and the "offender" isn't any
single process, but rather a combination of processes doing N-to-1 where N
is sufficiently large. I can't just tell one process to slow down. I have
to tell them all to slow down and do it quickly...
-Ron
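
For readers following the thread, here is a minimal sketch of the per-sender credit scheme discussed above. It is not code from either poster and it does not use MPI: it is a toy single-process simulation, and every name in it (`Receiver`, `try_send`, `receive`, `simulate`) is hypothetical. It assumes each sender starts with a fixed number of credits, spends one per message, and gets it back when the receiver matches that message out of the unexpected queue.

```python
from collections import deque


class Receiver:
    """Toy receiver with per-sender credits (hypothetical, not an MPI API)."""

    def __init__(self, num_senders, credits_per_sender):
        # Every sender is given the same starting number of credits.
        self.credits = [credits_per_sender] * num_senders
        self.unexpected = deque()  # messages that arrived before being matched
        self.peak = 0              # largest unexpected-queue length observed

    def try_send(self, sender, payload):
        """Return False (the sender must stall) if the sender has no credit."""
        if self.credits[sender] == 0:
            return False
        self.credits[sender] -= 1
        self.unexpected.append((sender, payload))
        self.peak = max(self.peak, len(self.unexpected))
        return True

    def receive(self):
        """Match the oldest unexpected message and return its credit."""
        if not self.unexpected:
            return None
        sender, payload = self.unexpected.popleft()
        self.credits[sender] += 1
        return sender, payload


def simulate(num_senders, credits_per_sender, sends_per_round, rounds):
    """N-to-1 traffic against a slow consumer that matches one message per round."""
    rx = Receiver(num_senders, credits_per_sender)
    stalled = 0
    for r in range(rounds):
        for s in range(num_senders):
            for _ in range(sends_per_round):
                if not rx.try_send(s, r):
                    stalled += 1
        rx.receive()
    return rx.peak, stalled


if __name__ == "__main__":
    peak, stalled = simulate(num_senders=4, credits_per_sender=2,
                             sends_per_round=3, rounds=5)
    print("peak unexpected queue:", peak, "stalled sends:", stalled)
```

In this sketch the unexpected queue can never exceed num_senders * credits_per_sender entries. That bound grows linearly with the number of senders, which is the scaling concern raised in the message: once there are too many endpoints for everyone to hold even one credit, a scheme of this shape no longer applies as written.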