Open MPI Development Mailing List Archives

Subject: [OMPI devel] BTL preferred_protocol , large message
From: Damien Guinier (damien.guinier_at_[hidden])
Date: 2011-03-08 12:12:26

Hi Jeff

I'm working on optimizing large message exchange. My optimization
consists of choosing the best protocol for each large message.
In fact,
- for each device, the way to choose the best protocol is different.
- the fastest protocol for a given device depends on that device's hardware
and on the message

So the device/BTL itself is the best place to dynamically select the
fastest protocol.

Presently, for large messages, the protocol selection is based only on
device capabilities.
My optimization consists of asking the device/BTL for a "preferred
protocol" and
then making a choice based on both the device capabilities and the
BTL's recommendation.

Technical view:
The optimization is located in mca_pml_ob1_send_request_start_btl(),
after the device/btl selection.
In the large message section, I call a new function:
    mca_pml_ob1_preferred_protocol() => mca_bml_base_preferred_protocol()
This one will try to launch
So, selecting a protocol before a large message is not in the critical
It is the BTL's responsibility to define this function to select a
preferred protocol.

If this function is not defined, nothing changes in the code path.
The drawback is that, to do this optimization, I had to add an interface
to the btl module structure in "btl.h".
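
To illustrate the idea (this is not the code from the patch), here is a
minimal sketch of an optional hook in a module structure. All names below
(example_btl_module_t, CAP_*, select_large_protocol, example_prefer_put) are
hypothetical and are not the real identifiers from "btl.h" or ob1. The
assumption is that a NULL hook, or a recommendation the device cannot honor,
falls back to the capability-only choice.

```c
#include <stddef.h>

/* Hypothetical capability/protocol bits; not Open MPI's real flag names. */
enum { CAP_SEND = 1, CAP_PUT = 2, CAP_GET = 4 };

typedef struct example_btl_module {
    unsigned flags; /* protocols this device is able to use */
    /* Optional hook: returns one CAP_* value, or 0 for "no preference".
     * Left NULL by a BTL that does not implement it. */
    unsigned (*preferred_protocol)(const struct example_btl_module *btl,
                                   size_t msg_size, int peer);
} example_btl_module_t;

/* Capability-only choice, standing in for the existing behaviour. */
static unsigned default_protocol(const example_btl_module_t *btl)
{
    if (btl->flags & CAP_GET) return CAP_GET;
    if (btl->flags & CAP_PUT) return CAP_PUT;
    return CAP_SEND;
}

/* Combine the device capabilities with the BTL's recommendation. */
static unsigned select_large_protocol(const example_btl_module_t *btl,
                                      size_t msg_size, int peer)
{
    unsigned fallback = default_protocol(btl);
    unsigned pref;

    if (NULL == btl->preferred_protocol) {
        return fallback; /* hook not defined: code path unchanged */
    }
    pref = btl->preferred_protocol(btl, msg_size, peer);
    /* Honor the recommendation only if the device supports it. */
    if (0 != pref && (btl->flags & pref) == pref) {
        return pref;
    }
    return fallback;
}

/* Example hook that always recommends PUT (illustration only). */
static unsigned example_prefer_put(const example_btl_module_t *btl,
                                   size_t msg_size, int peer)
{
    (void)btl; (void)msg_size; (void)peer;
    return CAP_PUT;
}
```

With this shape, a BTL that leaves the hook NULL behaves exactly as before,
and a BTL that sets it can only steer the choice among protocols it already
advertises.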

I have already used this feature to optimize the "shared memory"
device/BTL. I use the "preferred_protocol" feature to enable/disable
KNEM depending on whether the communication is intra-socket or
inter-socket. This optimization increases the bandwidth of the IMB
pingping benchmark by ~36%.
The next step is to use the "preferred protocol" feature with openib
(with many IB cards).
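
As a rough sketch of the kind of decision the sm BTL hook makes (again, not
the code from the patch): the names example_knem_policy_t and
example_use_knem are hypothetical, and the socket ids are assumed to come
from whatever locality information is available. Which of the two cases
should actually enable KNEM is left as a tunable here, since it depends on
the hardware.

```c
#include <stdbool.h>

/* Hypothetical policy: whether to use KNEM in each locality case. */
typedef struct {
    bool knem_intra_socket; /* peers on the same socket */
    bool knem_inter_socket; /* peers on different sockets */
} example_knem_policy_t;

/* Decide whether to use KNEM for a large message to a given peer. */
static bool example_use_knem(const example_knem_policy_t *policy,
                             int my_socket, int peer_socket)
{
    return (my_socket == peer_socket) ? policy->knem_intra_socket
                                      : policy->knem_inter_socket;
}
```

The policy values would be set per platform; the point is only that the
choice is made per peer, inside the BTL.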
Attached 2 patches:
1) BTL_preferred.patch:
    introduces the new preferred protocol interface
2) SM_KNEM_intra_socket.patch:
    defines the preferred protocol for the sm btl
    Note: Since the "ess" framework can't give us the "socket locality
          information", I used hitopo that has been proposed in an RFC
          some time ago: