On Jul 18, 2013, at 17:12 , "Iliev, Hristo" <Iliev@rz.rwth-aachen.de> wrote:

Could someone, who is more familiar with the architecture of the sm BTL, comment on the technical feasibility of the following: is it possible to easily extend the BTL (i.e. without having to rewrite it completely from scratch) so as to be able to perform transfers using both KNEM (or other kernel-assisted copying mechanism) for messages over a given size and the normal user-space mechanism for smaller messages with the switch-over point being a user-tunable parameter?

This is already what the SM BTL does. When support for kernel-assisted mechanisms is enabled, everything under the eager size goes over "traditional" shared memory (double copy and so on), while larger messages use the single-copy mechanism.
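
As a rough illustration of that switch-over, here is a minimal sketch in C. It is not the actual sm BTL code: the names copy_path_t, choose_copy_path, eager_limit and kernel_assist_enabled are hypothetical, and treating a message of exactly the eager size as "large" is an assumption.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; not the real sm BTL symbols. */
typedef enum {
    PATH_DOUBLE_COPY, /* copy-in/copy-out through the shared-memory segment */
    PATH_SINGLE_COPY  /* kernel-assisted copy, e.g. KNEM */
} copy_path_t;

/*
 * Pick the transfer path for a message of len bytes.
 * Messages under the eager limit always take the double-copy path;
 * larger ones use single copy, but only when kernel assistance is enabled.
 * Assumption: a message of exactly eager_limit bytes counts as large.
 */
static copy_path_t choose_copy_path(size_t len, size_t eager_limit,
                                    int kernel_assist_enabled)
{
    if (!kernel_assist_enabled || len < eager_limit) {
        return PATH_DOUBLE_COPY;
    }
    return PATH_SINGLE_COPY;
}
```

The switch-over point in this sketch is simply the eager size, so tuning that value moves the threshold. I believe the sm BTL exposes it as the MCA parameter btl_sm_eager_limit, but please check the ompi_info output for your build.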


From what I've seen, both implementations have something in common, e.g. both use FIFOs to communicate control information.
The motivation behind this is our effort to become greener by extracting the best possible out-of-the-box performance on our systems without having to profile each and every user application that runs on them. We've already determined that activating KNEM really benefits some collective operations on big shared-memory systems, but the increased latency significantly slows down small message transfers, which also hits the pipelined implementations.
sm's code doesn't seem to be very complex, but I've still decided to ask first before diving any deeper.
Kind regards,
Hristo Iliev, PhD - High Performance Computing Team
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23, D 52074 Aachen (Germany)