George & other list members,
I think I may have a race condition in this example that is masked by
the print_matrix statement.
For example, let's say rank one sleeps for a long time before reaching
the local transpose. Will the other ranks have completed the Alltoall
by then, so that when rank one reaches the local transpose it is
altering the data that the other processors sent it?