On Aug 13, 2007, at 11:28 AM, George Bosilca wrote:
>> Such a scheme is certainly possible, but I see even less use for it
>> than use cases for the existing microbenchmarks. Specifically,
>> header caching *can* happen in real applications (i.e., repeatedly
>> send short messages with the same MPI signature), but repeatedly
>> sending to the same peer with exactly the same signature *and*
>> exactly the same "long-enough" data (i.e., more than a small number
>> of ints that an app could use for its own message data caching) is
>> indicative of a poorly-written MPI application IMHO.
> If you look at the message size distribution for most of the HPC
> applications (at least the ones that get investigated in the papers)
> you will see that very small messages are only a non-significant
> percentage of messages.
This would be different than what Patrick has told us about Myricom's
analysis of real world MPI applications and one of the strong points
of QLogic's HCAs (that it's all about short message latency /
injection rate; bandwidth issues are [at least currently]
> As this "optimization" only addresses these
> kinds of messages, I doubt there is any real benefit from an
> application's point of view (obviously there will be a few exceptions,
> as usual). Header caching only makes sense for very small messages
> (MVAPICH only implements header caching for messages up to 155 bytes
> [that's less than 20 doubles] if I remember correctly), which makes
> it a real benchmark
I don't have enough data to say. But I'm sure there are at least
*some* applications out there that would benefit from it. Probably
somewhere between 1 and 99%. ;-)
But just to reiterate/be clear: my goal here is to reduce latency.
If header caching is not the way to go, then so be it.
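
For anyone following the thread who hasn't looked at the technique, here is a rough sketch of the idea behind header caching as discussed above. It is purely illustrative: the `Header` fields, the `Sender`/`Receiver` classes, and the wire format are all hypothetical and are not taken from Open MPI or MVAPICH. The 155-byte cutoff is just the figure George recalls, and the sketch assumes in-order delivery between a pair of peers.

```python
# Hypothetical illustration of header caching; not Open MPI's or MVAPICH's code.
from dataclasses import dataclass

# Cutoff recalled (unverified) in the thread for MVAPICH; an assumption here.
MAX_CACHED_PAYLOAD = 155  # bytes


@dataclass(frozen=True)
class Header:
    """Made-up match header: the fields that form the message 'signature'."""
    context_id: int  # stands in for the communicator
    src_rank: int
    tag: int
    length: int      # payload size in bytes


class Sender:
    def __init__(self):
        self._last = {}  # destination rank -> last full Header sent there

    def pack(self, peer, header, payload):
        """Return (header_or_None, payload); None means 'reuse cached header'."""
        small = len(payload) <= MAX_CACHED_PAYLOAD
        if small and self._last.get(peer) == header:
            return (None, payload)       # header elided on the wire
        self._last[peer] = header        # remember what the peer now holds
        return (header, payload)


class Receiver:
    def __init__(self):
        self._last = {}  # source rank -> last full Header received from it

    def unpack(self, peer, wire):
        """Rebuild the full header, falling back to the cached copy."""
        header, payload = wire
        if header is None:
            header = self._last[peer]    # relies on in-order delivery
        else:
            self._last[peer] = header
        return header, payload
```

The point of the sketch is only that the saving applies when the same peer gets the same signature twice in a row with a small payload, which is exactly the pattern the latency microbenchmarks exercise.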