Hi,
I have the following setup:
The rank 0 process on node 1 wants to send an array of a given size to the rank 1 process on the same node.
1. What optimisations can be enabled when launching with mpirun so that this memory-to-memory transfer is performed efficiently?
2. Is there any performance gain if two processes that exchange data arrays are placed on the same node rather than on different nodes connected by InfiniBand?
Awaiting a reply,
-Chev