Open MPI Development Mailing List Archives

From: Brian Barrett (bbarrett_at_[hidden])
Date: 2007-08-13 11:43:48


On Aug 13, 2007, at 9:33 AM, George Bosilca wrote:

> On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote:
>
>> Jeff Squyres wrote:
>>> I guess reading the graph that Pasha sent is difficult; Pasha -- can
>>> you send the actual numbers?
>>>
>> Ok, here are the numbers on my machines:
>> 0 bytes
>> mvapich with header caching: 1.56
>> mvapich without header caching: 1.79
>> ompi 1.2: 1.59
>>
>> So at zero bytes ompi is not so bad. We can also see that header caching
>> decreases the mvapich latency by 0.23.
>>
>> 1 byte
>> mvapich with header caching: 1.58
>> mvapich without header caching: 1.83
>> ompi 1.2: 1.73
>>
>> And here ompi makes a latency jump.
>>
>> In mvapich, header caching decreases the header size from 56 bytes to
>> 12 bytes.
>> What is the header size (pml + btl) in ompi?
>
> The match header size is 16 bytes, so it looks like ours is already
> optimized ...

Pasha -- is your build of Open MPI configured with
--disable-heterogeneous? If not, our headers all grow slightly to
support heterogeneous operations. For the heterogeneous case, a 1-byte
message includes:

   16 bytes for the match header
   4 bytes for the Open IB header
   1 byte for the payload
  ----
   21 bytes total

If you are using eager RDMA, there's an extra 4 bytes for the RDMA
length in the footer. Without heterogeneous support, 2 bytes get
knocked off the size of the match header, so the whole thing will be
19 bytes (+ 4 for the eager RDMA footer).
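
To make the arithmetic concrete, here is a minimal sketch (not Open MPI
source; the constant names are hypothetical, the byte counts are the ones
above) that computes the on-wire size of a small message under the
different configurations:

/* Sketch of the wire-size arithmetic described above.
 * Constant names are hypothetical; byte counts come from this thread. */
#include <stdio.h>

#define MATCH_HDR_HETEROGENEOUS 16  /* match header with heterogeneous support */
#define MATCH_HDR_HOMOGENEOUS   14  /* 2 bytes smaller with --disable-heterogeneous */
#define OPENIB_HDR               4  /* Open IB header */
#define EAGER_RDMA_FOOTER        4  /* RDMA length footer, eager RDMA only */

static size_t wire_size(size_t payload, int heterogeneous, int eager_rdma)
{
    size_t total = (heterogeneous ? MATCH_HDR_HETEROGENEOUS
                                  : MATCH_HDR_HOMOGENEOUS)
                 + OPENIB_HDR + payload;
    if (eager_rdma) {
        total += EAGER_RDMA_FOOTER;
    }
    return total;
}

int main(void)
{
    /* 1-byte message, heterogeneous build, send/recv path: 21 bytes */
    printf("heterogeneous, send/recv: %zu bytes\n", wire_size(1, 1, 0));
    /* 1-byte message, homogeneous build, eager RDMA: 19 + 4 = 23 bytes */
    printf("homogeneous, eager RDMA:  %zu bytes\n", wire_size(1, 0, 1));
    return 0;
}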

There are also considerably more ifs in the code path when heterogeneous
support is enabled, especially on x86 machines.
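
As a generic illustration (not Open MPI's actual receive path) of where
those extra branches come from: with heterogeneous support, every integer
field pulled out of a received header may need a conditional byte swap,
even on the common same-endianness path. The struct and marker field below
are hypothetical, and the sketch only shows the little-endian (x86)
receiver case:

#include <stdint.h>
#include <arpa/inet.h>   /* ntohl() */

/* Hypothetical header layout, for illustration only. */
struct msg_hdr {
    uint8_t  sender_big_endian;  /* 1-byte endianness marker, never swapped */
    uint8_t  pad[3];
    uint32_t tag;                /* stored in the sender's byte order */
};

static uint32_t hdr_tag(const struct msg_hdr *hdr)
{
    /* On a little-endian x86 receiver this branch runs on every message;
     * with --disable-heterogeneous it can be compiled out entirely. */
    if (hdr->sender_big_endian) {
        return ntohl(hdr->tag);  /* big-endian sender: swap to host order */
    }
    return hdr->tag;             /* same-endianness fast path */
}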

Brian