We're currently working with romio and we hit a problem when exchanging
data with hindexed types with the openib btl.
The attached reproducer (adapted from romio) works fine on tcp and
blocks on openib when using 1 port, but works if we use 2 ports (!). I
tested it against the trunk and the 1.3.3 release with the same result.
The basic idea is: processes 0..3 send contiguous data to process 0. Process 0
receives these buffers with an hindexed datatype which scatters data at
Receiving in a contiguous manner works, but receiving with an hindexed
datatype makes the remote sends block. Yes, the remote send, not the
receive. The receive is working fine and data is correctly scattered on
the buffer, but the senders on the other node are stuck in the Wait().
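
To make the receive-side layout concrete, here is a minimal sketch in plain C (no MPI calls) of what receiving a contiguous message through an hindexed layout does to the buffer. It is not the attached reproducer: the values of NPROCS, STRIPE and NB, the round-robin interleave, and the helper names hindexed_displs and scatter_recv are all assumptions made for illustration. In the real code, displacements like these would be passed as the byte-offset array of an hindexed datatype.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NPROCS 4  /* senders 0..3, as described above */
#define STRIPE 4  /* hypothetical block size in bytes */
#define NB     3  /* hypothetical number of blocks per sender */

/* Compute the byte displacement of each of the NB blocks contributed by
 * sender `rank`. ASSUMPTION: blocks from the senders are interleaved
 * round-robin in the receive buffer; the real reproducer may differ. */
static void hindexed_displs(int rank, size_t disp[NB])
{
    assert(rank >= 0 && rank < NPROCS);
    for (int b = 0; b < NB; b++)
        disp[b] = ((size_t)b * NPROCS + (size_t)rank) * STRIPE;
}

/* Emulate receiving one contiguous message of STRIPE*NB bytes through
 * that layout: block b of the message lands at base + disp[b]. */
static void scatter_recv(char *base, const char *msg, const size_t disp[NB])
{
    for (int b = 0; b < NB; b++)
        memcpy(base + disp[b], msg + (size_t)b * STRIPE, STRIPE);
}
```

Each sender still ships a single contiguous buffer of STRIPE*NB bytes; only the receiver sees a non-contiguous layout.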
I tried not using MPI_BOTTOM, which changed nothing. It seems that the
problem only occurs when STRIPE*NB (the size of the send) is larger than
12k (namely the RDMA threshold), but I didn't manage to remove the
deadlock by increasing the RDMA threshold.
I've tried to do some debugging, but I'm a bit lost on where the
non-contiguous types are handled and how they affect btl communication.
So, if anyone has a clue about where I should look, I'm interested!