Can you provide an example of a situation in which these semantically
redundant barriers help?
I may be missing something, but my statement for the textbook would be:
"If adding a barrier to your MPI program makes it run faster, there is
almost certainly a flaw in it that is better solved another way."
The only exception I can think of is some sort of one-directional data
dependency with messages small enough to be sent eagerly. A program that calls
MPI_Reduce with a small message and the same root every iteration and
calls no other collective would be an example.
In that case, fast tasks at leaf positions would run free, and a slow task
near the root could pile up early arrivals and end up slowed somewhat
further. Unless the slow task was driven into paging, though, I cannot
imagine the slowdown would be significant.
Even that should not be a problem for an MPI implementation that backs
off from eager sends before it floods its early-arrival buffers.
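
To make the pile-up argument concrete, here is a rough toy model in plain C.
It is not MPI code and does not reflect how any particular MPI implementation
manages its buffers. The function name peak_early_arrivals and its parameters
(ticks, send_period, recv_period, credit_limit) are my own inventions for
illustration. One fast sender posts a small eager message every send_period
ticks, one slow receiver drains a message every recv_period ticks, and an
optional credit limit stands in for an implementation that backs off from
eager sends.

```c
#include <assert.h>

/* Toy model (hypothetical, not an MPI API): returns the peak depth of the
 * receiver's early-arrival queue over `ticks` time steps.
 *
 * send_period  - the fast sender tries to post one eager message every
 *                send_period ticks
 * recv_period  - the slow receiver drains one queued message every
 *                recv_period ticks
 * credit_limit - 0 means no back-off (unbounded eager sends); otherwise the
 *                sender skips its slot while credit_limit messages are
 *                already queued and retries at its next slot
 */
long peak_early_arrivals(long ticks, long send_period, long recv_period,
                         long credit_limit)
{
    assert(send_period > 0 && recv_period > 0 && credit_limit >= 0);

    long depth = 0; /* messages currently sitting in the early-arrival queue */
    long peak = 0;  /* largest depth seen so far */

    for (long t = 1; t <= ticks; ++t) {
        /* Sender side: post eagerly unless the credit limit says to wait. */
        if (t % send_period == 0 &&
            (credit_limit == 0 || depth < credit_limit)) {
            ++depth;
        }
        if (depth > peak) {
            peak = depth;
        }
        /* Receiver side: the slow task consumes one message when it gets
         * around to it. */
        if (t % recv_period == 0 && depth > 0) {
            --depth;
        }
    }
    return peak;
}
```

Calling peak_early_arrivals(1000, 1, 10, 0) models a sender ten times faster
than the receiver with no back-off, and the queue keeps growing for the whole
run. Calling peak_early_arrivals(1000, 1, 10, 4) caps the queue at four
messages, which is the behavior I would expect from an implementation with
flow control. In the capped case a barrier adds nothing.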
Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363