Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2010-04-07 19:56:13


Oliver Geisler wrote:

>Using netpipe and comparing tcp and mpi communication I get the
>following results:
>
>TCP is much faster than MPI, approx. by factor 12
>
>
Faster? 12x? I don't understand the following:

>e.g. a packet size of 4096 bytes is delivered in
>97.11 usec with NPtcp and
>15338.98 usec with NPmpi
>
>
This implies NPtcp is 160x faster than NPmpi.

>or
>packet size 262kb
>0.05268801 sec NPtcp
>0.00254560 sec NPmpi
>
>
This implies NPtcp is 20x slower than NPmpi.
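
To make the arithmetic behind those two remarks explicit, here is a small sketch that recomputes the ratios from the four timings quoted above. The variable names are only illustrative; the numbers are the ones reported in the quoted message.

```python
# Timings quoted in the message above, converted to seconds.
tcp_4k = 97.11e-6       # NPtcp, 4096-byte packet
mpi_4k = 15338.98e-6    # NPmpi, 4096-byte packet
tcp_262k = 0.05268801   # NPtcp, 262 kB packet
mpi_262k = 0.00254560   # NPmpi, 262 kB packet

# How many times longer the slower transport takes at each size.
small = mpi_4k / tcp_4k      # NPmpi is the slow one here
large = tcp_262k / mpi_262k  # NPtcp is the slow one here

print(f"4096 bytes: NPmpi takes {small:.0f}x as long as NPtcp")
print(f"262 kB:     NPtcp takes {large:.1f}x as long as NPmpi")
```

Neither ratio is close to 12, and the two point in opposite directions, which is why the single "factor 12" figure is hard to interpret.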

>Further, our benchmark started with "--mca btl tcp,self" runs with short
>communication times, even using kernel 2.6.33.1
>
>Is there a way to see what type of communication is actually selected?
>
>Can anybody imagine why shared memory leads to these problems?
>
>
Okay, so it's a shared-memory performance problem since:

1) You get better performance when you exclude sm explicitly with "--mca
btl tcp,self".
2) You get better performance when you exclude sm by distributing one
process per node (an observation you made relatively early in this thread).
3) TCP is faster than MPI (which is presumably using sm).

Can you run a pingpong test as a function of message length for two
processes in a way that demonstrates the problem? For example, if
you're comfortable with SKaMPI, just look at Pingpong_Send_Recv and
let's see what performance looks like as a function of message length.
Maybe this is a short-message-latency problem.
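
If SKaMPI is not at hand, a minimal pingpong along these lines might do. This is only a sketch, not SKaMPI's Pingpong_Send_Recv: it assumes the mpi4py Python bindings are installed (nothing in this thread says they are), and the function names, the repetition count, and the 1 MiB upper limit are arbitrary choices made for illustration.

```python
def message_sizes(max_bytes=1 << 20):
    """Powers of two from 1 byte up to max_bytes, inclusive."""
    sizes = []
    n = 1
    while n <= max_bytes:
        sizes.append(n)
        n *= 2
    return sizes


def pingpong(comm, wtime, nbytes, reps=100):
    """Return the mean one-way time in seconds for nbytes-sized messages.

    Rank 0 sends and waits for the echo; the other rank echoes back.
    The clock is passed in so the caller chooses the timer (MPI.Wtime below).
    """
    rank = comm.Get_rank()
    buf = bytearray(nbytes)
    comm.Barrier()  # start both ranks together
    start = wtime()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)
            comm.Recv(buf, source=1, tag=0)
        else:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=0)
    elapsed = wtime() - start
    # Each repetition is one round trip, i.e. two one-way messages.
    return elapsed / (2 * reps)


def main():
    try:
        from mpi4py import MPI  # assumption: mpi4py is available
    except ImportError:
        print("mpi4py is not installed; nothing to measure")
        return
    comm = MPI.COMM_WORLD
    if comm.Get_size() != 2:
        if comm.Get_rank() == 0:
            print("run with exactly two processes")
        return
    for nbytes in message_sizes():
        t = pingpong(comm, MPI.Wtime, nbytes)
        if comm.Get_rank() == 0:
            print(f"{nbytes:>8d} bytes  {t * 1e6:12.2f} usec")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as pingpong.py, it could be launched with "mpirun -np 2 python pingpong.py" on a single node, and then again with "--mca btl tcp,self" added, so the two latency curves can be compared at each message length.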