There are a bunch of changes in the shared memory module between 1.2.9
and 1.3.1. One significant change is the introduction of the "sendi"
internal interface. I believe George Bosilca did the initial
implementation. This is just a wild guess, but maybe there is
something about sendi that increases latency when using the shared