
Open MPI User's Mailing List Archives


From: Miguel Figueiredo Mascarenhas Sousa Filipe (miguel.filipe_at_[hidden])
Date: 2006-11-08 07:25:20


Hi,

On 11/8/06, Greg Lindahl <greg.lindahl_at_[hidden]> wrote:
> On Tue, Nov 07, 2006 at 05:02:54PM +0000, Miguel Figueiredo Mascarenhas Sousa Filipe wrote:
>
> > if your aplication is on one given node, sharing data is better than
> > copying data.
>
> Unless sharing data repeatedly leads you to false sharing and a loss
> in performance.

What does that mean? I did not understand that.
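(For context: false sharing is when threads on different cores write to distinct variables that happen to sit in the same cache line, so every write by one thread invalidates the line in the other core's cache and the hardware spends its time bouncing the line back and forth. A minimal POSIX threads sketch of the pattern, assuming 64-byte cache lines and arbitrary loop counts:)

/* False sharing sketch: two threads increment two *different* counters
 * that happen to live in the same cache line, so each write invalidates
 * the line in the other core's cache. Names and sizes are illustrative. */
#include <pthread.h>
#include <stdio.h>

struct counters {
    long a;                       /* written by thread 1 */
    long b;                       /* written by thread 2 (same cache line as a) */
};

struct counters_padded {
    long a;
    char pad[64 - sizeof(long)];  /* push b onto its own cache line */
    long b;
};

static struct counters shared;

static void *bump_a(void *arg) {
    (void)arg;
    for (long i = 0; i < 100000000L; i++) shared.a++;
    return NULL;
}

static void *bump_b(void *arg) {
    (void)arg;
    for (long i = 0; i < 100000000L; i++) shared.b++;
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("a=%ld b=%ld\n", shared.a, shared.b);
    return 0;
}

(Using the padded layout instead, so the two counters land on different cache lines, typically lets the two loops scale again.)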

>
> > the MPI model assumes you don't have a "shared memory" system..
> > therefore it is "message passing" oriented, and not designed to
> > perform optimally on shared memory systems (like SMPs, or numa-CCs).
>
> For many programs with both MPI and shared memory implementations, the
> MPI version runs faster on SMPs and numa-CCs. Why? See the previous
> paragraph...

Of course it does.. it's faster to copy data in main memory than it is
to push it through any kind of network interface. You can optimize your
message passing implementation down to a couple of memory-to-memory
copies when the ranks are on the same node. In the worst case, even
when using local IP addresses to communicate between peers/ranks on the
same node, the operating system doesn't even touch the interface.. it
just copies data from a TCP sender buffer to a TCP receiver buffer..
in the end that's always faster than going through a physical network
link.
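(A minimal sketch of the case described above: two ranks on one node exchanging a buffer. With both ranks on the same host, an MPI implementation can service this with memory-to-memory copies, e.g. through a shared-memory transport, instead of a physical network link. The buffer size and tag are arbitrary illustration values.)

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    int rank;
    char buf[64];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        strcpy(buf, "hello from rank 0");
        MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 got: %s\n", buf);
    }

    MPI_Finalize();
    return 0;
}

(Run with something like mpirun -np 2 ./a.out on a single host, so both ranks stay on the same node.)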

But you still have a message passing API that is doing memory-to-memory
copies.. it's a worse framework for doing memory copies than an API
designed just for that.
One could argue that MPI is more than a message passing API, since it
also provides APIs to apply operators to the data..
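(For example, MPI_Reduce applies an operator to data spread across the ranks; a minimal sketch using the built-in MPI_SUM operator:)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Apply MPI_SUM across every rank's "rank" value; result lands on rank 0. */
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of ranks 0..%d = %d\n", size - 1, sum);

    MPI_Finalize();
    return 0;
}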

But, for instance.. try to benchmark real applications with MPI and
POSIX threads implementations on the same numa-cc or big SMP machine..
my bet is that the POSIX threads implementation is going to be faster..
There are always exceptions.. like having a very well designed MPI
application but a terrible POSIX threads one.. or a design that just
isn't that adaptable to a POSIX threads programming model (or an MPI
model).
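(A sketch of what the shared-memory side of such a benchmark could look like: worker threads sum disjoint slices of one shared array in place, with no copies of the data between workers. NTHREADS and N are arbitrary illustration values, not from any real benchmark.)

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 1000000

static double data[N];            /* shared by all threads, never copied */
static double partial[NTHREADS];

static void *sum_slice(void *arg) {
    long id = (long)arg;
    long lo = id * (N / NTHREADS);
    long hi = (id == NTHREADS - 1) ? N : lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++) s += data[i];
    /* each thread writes its own slot; note the adjacent slots here are
     * themselves a candidate for the false sharing mentioned above */
    partial[id] = s;
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    for (long i = 0; i < N; i++) data[i] = 1.0;

    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, sum_slice, (void *)t);

    double total = 0.0;
    for (long t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += partial[t];
    }
    printf("total = %.0f\n", total);
    return 0;
}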

-- 
Miguel Sousa Filipe