On Jun 11, 2007, at 9:27 AM, Brock Palen wrote:
> With openmpi-1.2.0
> i ran a: ompi_info --param btl tcp
> and i see reference to:
> MCA btl: parameter "btl_tcp_min_rdma_size" (current value: "131072")
> MCA btl: parameter "btl_tcp_max_rdma_size" (current value:
> Can TCP support RDMA? I thought you needed fancy hardware to get
> such support? Light on this subject is highly appreciated.
> Also if a user using ethernet, is trying to up the limit for
> 'greedy' messages that the btl_tcp_eager_limit? Is there a problem
> increasing its size? We will test it with his app of-course, but
> was wondering if there was a 'gotcha' I was going to walk into.
Hi Brock -
The "rdma" part of the TCP transport isn't real RDMA; it just names
which protocol the upper layers use to transfer data. In the send/
receive protocol, receives always involve a copy. In the RDMA
protocol (which is pretty simple to fake with a send/receive
interface), the TCP BTL header includes the remote address, so no
copy is involved. So no, we haven't discovered some hidden interface
in TCP -- we're just trying to have as few special cases for various
interconnects as possible :).
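
To make the "fake RDMA over send/receive" idea concrete, here is a
rough sketch in Python. It is not Open MPI's code: the header layout
and the names put, handle_put and recv_exact are invented for
illustration, and a byte offset into a buffer stands in for the remote
address. It only shows the idea that the header says where the data
belongs, so the receiver can read the payload straight into its final
location rather than into a staging buffer that gets copied afterwards.

```python
import socket
import struct

# Hypothetical header: (destination offset, payload length).
# An offset into `target` stands in for a real remote address.
HEADER = struct.Struct("!QQ")


def recv_exact(sock, view):
    """Fill the writable buffer `view` completely from the socket."""
    got = 0
    while got < len(view):
        n = sock.recv_into(view[got:])
        if n == 0:
            raise ConnectionError("peer closed before all data arrived")
        got += n


def put(sock, offset, payload):
    """Sender side: the header names where the data goes on the receiver."""
    sock.sendall(HEADER.pack(offset, len(payload)) + payload)


def handle_put(sock, target):
    """Receiver side: read the header, then receive directly into `target`."""
    hdr = bytearray(HEADER.size)
    recv_exact(sock, memoryview(hdr))
    offset, length = HEADER.unpack(hdr)
    if offset + length > len(target):
        raise ValueError("put would run past the end of the target buffer")
    # No intermediate staging buffer: the payload lands in place.
    recv_exact(sock, memoryview(target)[offset:offset + length])


def demo():
    """Put 5 bytes at offset 4 of a 16-byte buffer over a local socket pair."""
    a, b = socket.socketpair()
    target = bytearray(16)
    try:
        put(a, 4, b"hello")
        handle_put(b, target)
    finally:
        a.close()
        b.close()
    return bytes(target)
```

Calling demo() leaves "hello" at bytes 4 through 8 of the target
buffer. The kernel still copies data out of the socket, of course; the
sketch only avoids the extra user-level copy.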
Yes, increasing btl_tcp_eager_limit is how you increase the
"greedy" message size. It's currently 64K, and the only problem with
increasing it is memory usage. With TCP, even if you only need to send
a 4-byte message, a full 64K fragment is used on both the sender and
the receiver during the transfer. These fragments are free-listed
(kept for reuse once allocated), so you can very quickly cause Open
MPI to use lots and lots of memory if the eager limit is too big. If
you start seeing segfaults, bus errors, and failed mallocs, you might
have bumped the eager limit too high and run yourself out of memory...
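
As a back-of-the-envelope check before raising the limit, something
like the following Python sketch may help. It assumes each eager
fragment occupies a full eager-limit-sized buffer regardless of the
message size, as described above. The fragment count of 1000 is a
made-up number for illustration, not something Open MPI reports, and
the function name is hypothetical.

```python
KIB = 1024


def eager_memory_bytes(eager_limit, fragments_in_use):
    """Memory held by eager fragments on one side of the transfer.

    Assumes every fragment is a full `eager_limit`-sized buffer,
    even when the message itself is only a few bytes.
    """
    return eager_limit * fragments_in_use


# Hypothetical workload: 1000 eager fragments allocated in one process.
at_64k = eager_memory_bytes(64 * KIB, 1000)
at_1m = eager_memory_bytes(1024 * KIB, 1000)

print(f"64K eager limit: {at_64k / (KIB * KIB):.1f} MiB")
print(f"1M eager limit:  {at_1m / (KIB * KIB):.1f} MiB")
```

With those assumed numbers, the same traffic pattern goes from about
62.5 MiB to 1000 MiB per process, so it is worth scaling the limit up
gradually. The value can be set per run on the command line, e.g.
mpirun --mca btl_tcp_eager_limit 131072 ./your_app (where ./your_app
is a placeholder).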