> So far as I can tell, the issue is one of blocking. The OOB handshake is now async - i.e., you post a non-blocking recv at the beginning of time, and then do a non-blocking send to the other side when you want to create a connection. The question is: how do you know when that connection is ready?
As you describe it, the new behavior is identical to the original one. We post a non-blocking (persistent) receive during initialization. Later, OMPI has a barrier in the flow to ensure that all processes have reached that point.
On the first send, we use a non-blocking OOB send to initialize the connection (QPs). The receive triggers a callback that handles the connection setup. The OOB / XOOB communication semantics are fully non-blocking.
We don't really block anywhere.
We use only the ompi_rte_recv_buffer_nb and ompi_rte_send_buffer_nb functions.
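
To make the flow concrete, here is a rough sketch in C of the pattern described above: post a persistent non-blocking receive up front, start a connection with a non-blocking send, and let the receive callback finish the setup. Every name in it (oob_channel_t, oob_post_recv_nb, oob_send_nb, oob_progress, start_connect) is a hypothetical stand-in, not the real ompi_rte_* signatures, and an in-process queue takes the place of the actual OOB transport. For brevity, both "sides" share a single state variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; the real OOB callbacks take different arguments. */
typedef void (*oob_recv_cb_t)(int peer, void *ctx);

enum { OOB_QUEUE_LEN = 8 };

/* Toy channel: a persistent receive callback plus a queue of undelivered messages. */
typedef struct {
    oob_recv_cb_t cb;
    void *ctx;
    int pending[OOB_QUEUE_LEN];
    size_t count;
} oob_channel_t;

typedef enum { CONN_CLOSED, CONN_CONNECTING, CONN_CONNECTED } conn_state_t;

/* Post the persistent receive once, during initialization. Returns immediately. */
static void oob_post_recv_nb(oob_channel_t *ch, oob_recv_cb_t cb, void *ctx)
{
    ch->cb = cb;
    ch->ctx = ctx;
    ch->count = 0;
}

/* Queue a message and return immediately; -1 if the toy queue is full. */
static int oob_send_nb(oob_channel_t *ch, int peer)
{
    if (ch->count == OOB_QUEUE_LEN) {
        return -1;
    }
    ch->pending[ch->count++] = peer;
    return 0;
}

/* Progress loop: deliver queued messages to the posted callback. */
static size_t oob_progress(oob_channel_t *ch)
{
    assert(ch->cb != NULL); /* a receive must have been posted first */
    size_t delivered = ch->count;
    for (size_t i = 0; i < delivered; i++) {
        ch->cb(ch->pending[i], ch->ctx);
    }
    ch->count = 0;
    return delivered;
}

/* Receive callback: this is where the connection setup would happen. */
static void on_connect_request(int peer, void *ctx)
{
    conn_state_t *state = ctx;
    (void)peer;
    *state = CONN_CONNECTED;
}

/* First send to a peer: mark the connection as in progress, send, do not wait. */
static int start_connect(oob_channel_t *ch, conn_state_t *state, int peer)
{
    *state = CONN_CONNECTING;
    if (oob_send_nb(ch, peer) != 0) {
        *state = CONN_CLOSED;
        return -1;
    }
    return 0;
}

/* Walk through the whole handshake once and report the final state. */
static conn_state_t demo_handshake(void)
{
    oob_channel_t ch;
    conn_state_t state = CONN_CLOSED;

    oob_post_recv_nb(&ch, on_connect_request, &state);
    (void)start_connect(&ch, &state, 1); /* state is CONN_CONNECTING here */
    (void)oob_progress(&ch);             /* callback flips it to CONN_CONNECTED */
    return state;
}
```

Nothing in the sketch ever waits: the sender only learns that the connection is ready because the callback changed the state, which it observes the next time it checks.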