On Feb 29, 2012, at 2:57 PM, Jingcha Joba wrote:
> So if I understand correctly, if a message size is smaller than it will use the MPI way (non-RDMA, 2 way communication), if its larger, then it would use the Open Fabrics, by using the ibverbs (and ofed stack) instead of using the MPI's stack?
So let's talk MPI-over-OpenFabrics-verbs specifically.
All MPI communication calls will use verbs under the covers. They may use verbs send/receive semantics in some cases, and RDMA semantics in other cases. "It depends" -- on a lot of things, actually. It's hard to come up with a good rule of thumb for when it uses one or the other; this is one of the reasons that the openib BTL code is so complex. :-)
The main points here are:
1. You can trust the openib BTL to do the best thing possible to get the message to the other side, regardless of whether that message is an MPI_SEND or an MPI_PUT (for example).
2. MPI_PUT does not necessarily == verbs RDMA write (and likewise, MPI_GET does not necessarily == verbs RDMA read).
> If so, could that be the reason why the MPI_Put "hangs" when sending a message more than 512KB (or may be 1MB)?
No. I'm guessing that there's some kind of bug in the MPI_PUT implementation.
> Also is there a way to know if for a particular MPI call, OF uses send/recv or RDMA exchange?
More specifically: all things being equal, you don't care which is used. You just want your message to get to the receiver/target as fast as possible. One of the main ideas of MPI is to hide those kinds of details from the user. I.e., you call MPI_SEND. A miracle occurs. The message is received on the other side.