After some digging, Terry and I discovered the problem with r26626. To perform an rdma transaction, the pmls used to explicitly promote the seg_addr from prepare_src/dst to 64 bits before sending it over the wire. The other end would then (inconsistently) use the lval to perform the get/put. Segments are now opaque objects, so the pmls simply memcpy the segments into the rdma header (without promoting seg_addr). So, right now we have a mixture of lvals and pvals in the put and get paths, which will not work in two cases: 32-bit environments and mixed 32/64-bit environments.
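To make the failure concrete, here is a minimal sketch of what goes wrong when only a 32-bit pval is written into a 64-bit address slot and the receiver reads the whole slot as an lval. The names (demo_wire_addr_t, demo_pack_raw, demo_read_lval) are hypothetical and are not the real btl/pml types; the 32-bit pointer is modelled with a uint32_t so the sketch behaves the same on a 64-bit host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical wire slot: 8 bytes reserved for a segment address. */
typedef struct {
    uint8_t bytes[8];
} demo_wire_addr_t;

/* Models a 32-bit sender that stores only the pointer-sized part of the
 * slot. The remaining 4 bytes keep whatever was there before ("stale"). */
static demo_wire_addr_t demo_pack_raw(uint32_t ptr32, uint8_t stale)
{
    demo_wire_addr_t w;
    memset(w.bytes, stale, sizeof(w.bytes));   /* leftover bytes */
    memcpy(w.bytes, &ptr32, sizeof(ptr32));    /* only 4 of 8 bytes written */
    return w;
}

/* Models a receiver that treats the whole slot as a 64-bit lval. */
static uint64_t demo_read_lval(demo_wire_addr_t w)
{
    uint64_t v;
    memcpy(&v, w.bytes, sizeof(v));
    return v;
}
```

With stale bytes of 0xAB, demo_read_lval(demo_pack_raw(0x1000, 0xAB)) does not come back as 0x1000 on either byte order, which is the lval/pval mismatch described above.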
I can think of a few ways to fix this:
- Require the pmls to explicitly promote seg_addr to 64 bits after the memcpy. This is a band-aid fix, but I can implement and commit it very quickly (it will work fine until a more permanent solution is found).
- Require prepare_src/dst to return segments with 64-bit addresses for all rdma fragments (0 == reserve). This is relatively simple for most btls but a little more complicated for openib. The openib btl may pack data for a get/put into a send segment. The obvious way to handle this case is to set the lval in prepare_src and restore the pval when the send fragment is returned.
- Change the btl interface in a way that allows the btl to prepare segments specifically to be sent to another machine. This is a bit more complicated and would require lots of discussion and an RFC.
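For reference, the first option could look roughly like the sketch below. This is only an illustration under assumptions: demo_ptr_t, demo_segment_t and demo_pack_segment are hypothetical stand-ins, not the actual segment or header definitions, and the real segment layout may differ.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical address union: a 64-bit value overlaid with a native pointer. */
typedef union {
    uint64_t lval;
    void    *pval;
} demo_ptr_t;

/* Hypothetical segment: an address plus a length. */
typedef struct {
    demo_ptr_t seg_addr;
    uint32_t   seg_len;
} demo_segment_t;

/* Copy a segment into the header, then promote the address to 64 bits so
 * the remote side can always read seg_addr.lval, whatever the local
 * pointer width is. */
static void demo_pack_segment(demo_segment_t *hdr_seg, const demo_segment_t *src)
{
    void *p = src->seg_addr.pval;                     /* native pointer as stored */
    memcpy(hdr_seg, src, sizeof(*hdr_seg));           /* opaque copy, as done now */
    hdr_seg->seg_addr.lval = (uint64_t)(uintptr_t)p;  /* explicit promotion */
}
```

The promotion goes through uintptr_t so the conversion is well defined on both 32-bit and 64-bit builds. It does not address byte-order differences between the two ends.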
I am open to suggestions.