On Feb 7 2009, Jeff Squyres wrote:
>On Feb 7, 2009, at 12:23 PM, Brian W. Barrett wrote:
>> That is significantly higher than I would have expected for a single
>> function call. When I did all the component tests a couple years
>> ago, a function call into a shared library was about 5ns on an Intel
>> Xeon (pre-Core 2 design) and about 2.5ns on an AMD Opteron.
>Good; I'm not crazy for thinking that this is a little too obvious --
>it smells like I did something wrong. Could someone eyeball these
>files and see if I missed anything obvious:
At the risk of telling grandmothers how to suck eggs, have you tried
with different compilers, different systems and/or adding a few
irrelevant (but not optimisable-out) declarations or statements?
That sort of phenomenon is exactly what happens when you trip over a
cache problem - e.g. running out of cache associativity. It can also
occur because of pipeline drain (e.g. branch misprediction) problems.
Neither of those would be found by eyeballing the code - you would at
least have to eyeball the assembler.
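
For what it's worth, a perturbation test of that kind might look
something like the following. It is only a sketch and not the benchmark
under discussion: callee, call_ptr, ns_per_call and report_sweep are
made-up names, and the volatile function pointer merely stands in for
a call into a shared library. The volatile padding array plays the part
of the irrelevant but not optimisable-out declaration. If the time per
call jumps as the padding changes, placement effects are a more likely
culprit than the call itself.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the function whose call cost is measured. */
static unsigned callee(unsigned x) { return x + 1u; }

/* volatile pointer: the compiler must reload it and make a real
 * indirect call on every iteration, so the call cannot be inlined. */
static unsigned (*volatile call_ptr)(unsigned) = callee;

/* volatile sink: keeps the loop result (and the padding) live. */
static volatile unsigned sink;

static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average nanoseconds per call over `iters` calls (iters > 0), with
 * `pad` extra bytes of live stack below the caller's other locals.
 * The loop and timer overhead is included in the figure. */
double ns_per_call(unsigned long iters, size_t pad)
{
    volatile char padding[pad + 1];   /* irrelevant, but not removable */
    padding[0] = 0;
    padding[pad] = 0;

    unsigned acc = 0;
    double t0 = now_ns();
    for (unsigned long i = 0; i < iters; i++)
        acc = call_ptr(acc);
    double t1 = now_ns();

    sink = acc + (unsigned)padding[pad];
    return (t1 - t0) / (double)iters;
}

/* Print the per-call time for a handful of arbitrary padding sizes. */
void report_sweep(unsigned long iters)
{
    static const size_t pads[] = { 0, 16, 64, 256, 1024, 4096 };
    for (size_t i = 0; i < sizeof pads / sizeof pads[0]; i++)
        printf("pad %4zu bytes: %.2f ns/call\n",
               pads[i], ns_per_call(iters, pads[i]));
}
```

Calling report_sweep(10000000UL) from main() prints one line per padding
size. The numbers only mean something when compared across compilers,
optimisation levels and machines, and a flat result does not rule out
a cache or pipeline problem in the real code.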
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Tel.: +44 1223 334761 Fax: +44 1223 334679