
Open MPI Development Mailing List Archives


Subject: Re: [OMPI devel] RFC: [slightly] Optimize Fortran MPI_SEND / MPI_RECV
From: N.M. Maclaren (nmm1_at_[hidden])
Date: 2009-02-08 13:52:52


On Feb 7 2009, Jeff Squyres wrote:
>On Feb 7, 2009, at 12:23 PM, Brian W. Barrett wrote:
>
>> That is significantly higher than I would have expected for a single
>> function call. When I did all the component tests a couple years
>> ago, a function call into a shared library was about 5 ns on an Intel
>> Xeon (pre-Core 2 design) and about 2.5 ns on an AMD Opteron.
>
>Good; I'm not crazy for thinking that this is a little too obvious --
>it smells like I did something wrong. Could someone eyeball these
>files and see if I missed anything obvious:

At the risk of telling grandmothers how to suck eggs, have you tried
with different compilers, on different systems, and/or after adding a few
irrelevant (but not optimisable-out) declarations or statements?

That sort of phenomenon is exactly what happens when you trip over a
cache problem - e.g. running out of cache associativity. It can also
occur because of pipeline drain (e.g. branch misprediction) problems.
Neither of those would be found by eyeballing the code - you would at
least have to eyeball the assembler.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1_at_[hidden]
Tel.: +44 1223 334761 Fax: +44 1223 334679