Open MPI User's Mailing List Archives

From: Miguel Figueiredo Mascarenhas Sousa Filipe (miguel.filipe_at_[hidden])
Date: 2006-11-07 12:02:54


Hi,

On 11/7/06, Chevchenkovic Chevchenkovic <chevchenkovic_at_[hidden]> wrote:
> Hi,
> I had the following setup:
> Rank 0 process on node 1 wants to send an array of particular size to Rank
> 1 process on same node.
> 1. What are the optimisations that can be done/invoked while running mpirun
> to perform this memory to memory transfer efficiently?
> 2. Is there any performance gain if 2 processes that are exchanging data
> arrays are kept on the same node rather than on different nodes connected by
> infiniband?

If your application runs on a single node, sharing data is better than
copying it.
You can do this with the Unix shared memory API or with the POSIX threads API.
If the tasks share the same address space (as threads do) and a copy is
still necessary, memcpy() is probably the fastest way (make sure the data
is aligned in memory).
However, by definition this does not work for applications that span
multiple computers.
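
As a rough illustration of sharing instead of copying between two processes
on one node, here is a minimal sketch. It is an assumption on my part, not
something from the original question: it uses an anonymous shared mapping
from mmap() together with fork(), swapped in for the System V or POSIX
named shared memory calls because it needs no key or name. The helper name
shared_sum is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent fills an array of n longs (1..n) in a
 * shared mapping, a forked child sums it in place, and the parent reads
 * the result back. No array data is copied or sent as a message.
 * Returns the sum, or -1 on error. */
long shared_sum(size_t n)
{
    /* One extra slot at the end holds the child's result. */
    size_t bytes = (n + 1) * sizeof(long);
    long *buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < n; i++)
        buf[i] = (long)i + 1;
    buf[n] = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, bytes);
        return -1;
    }
    if (pid == 0) {
        /* Child: reads the very same physical pages the parent wrote. */
        long s = 0;
        for (size_t i = 0; i < n; i++)
            s += buf[i];
        buf[n] = s;
        _exit(0);
    }

    /* Parent: wait for the child, then pick up the result. */
    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(buf, bytes);
        return -1;
    }
    long result = buf[n];
    munmap(buf, bytes);
    return result;
}
```

Calling shared_sum(4) should give 10. Waiting for the child with waitpid()
is the only synchronisation here; two long-running processes would need
something like a process-shared semaphore or mutex instead.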

If you can run one application (process) per node, with several threads
per node, consider using MPI only between applications, and set up your
MPI framework to launch one application per node.

Program your application to use a number of threads per rank (node), with
the POSIX threading model for parallel execution within each node (for
instance, with #threads == NCPUS), and use MPI for communicating between
nodes.
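
The per-node half of that layout could be sketched roughly as below. This
is only an illustration under my own assumptions: the names threaded_sum,
sum_slice and struct slice are hypothetical, the work (summing an array) is
a stand-in for the real computation, and the MPI step between nodes is left
out so the example builds without an MPI installation.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Work description for one thread: a slice of an array that all threads
 * of the process share. */
struct slice {
    const double *data;   /* shared and read-only, never copied */
    size_t begin, end;    /* half-open range [begin, end) */
    double partial;       /* written by exactly one thread */
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    double acc = 0.0;
    for (size_t i = s->begin; i < s->end; i++)
        acc += s->data[i];
    s->partial = acc;
    return NULL;
}

/* Hypothetical helper: sums data[0..n) using nthreads threads. If
 * nthreads <= 0, the number of online CPUs is used (#threads == NCPUS).
 * Returns 0 and stores the sum in *out on success, -1 on error. */
int threaded_sum(const double *data, size_t n, long nthreads, double *out)
{
    if (nthreads <= 0)
        nthreads = sysconf(_SC_NPROCESSORS_ONLN);
    if (nthreads <= 0)
        nthreads = 1;

    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    struct slice *work = malloc((size_t)nthreads * sizeof *work);
    if (!tids || !work) {
        free(tids);
        free(work);
        return -1;
    }

    long started = 0;
    int rc = 0;
    for (long t = 0; t < nthreads; t++) {
        work[t].data = data;
        work[t].begin = n * (size_t)t / (size_t)nthreads;
        work[t].end = n * (size_t)(t + 1) / (size_t)nthreads;
        work[t].partial = 0.0;
        if (pthread_create(&tids[t], NULL, sum_slice, &work[t]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* Join every thread that was started, then combine partial sums. */
    double total = 0.0;
    for (long t = 0; t < started; t++) {
        pthread_join(tids[t], NULL);
        total += work[t].partial;
    }
    free(tids);
    free(work);

    if (rc == 0)
        *out = total;
    /* In the hybrid layout, each rank would now combine its node-local
     * total with the other nodes' totals through MPI. */
    return rc;
}
```

Passing 0 for nthreads picks the CPU count at run time. If threads other
than the main one also make MPI calls, the MPI library has to be
initialised with a suitable thread support level, so it is simplest to keep
all MPI calls in the main thread.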

The MPI model assumes you don't have a "shared memory" system.
It is therefore "message passing" oriented, and not designed to
perform optimally on shared memory systems (such as SMPs or ccNUMA machines).

Best regards,

-- 
Miguel Sousa Filipe