Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] RFC: optimize probe in ob1
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2014-02-10 19:05:30

On Tue, Feb 11, 2014 at 12:29:57AM +0100, George Bosilca wrote:
> Nathan,
> While this sounds like an optimization for highly specific application behavior, it is justifiable under some usage scenarios. I have several issues with the patch. Here are the minor ones:
> 1. It does modifications that are nor necessary to the patch itself (as an example removal of the static keyword from the mca_pml_ob1_comm_proc_t class instance).

Yeah. Not really part of the RFC. I should have removed it from the
patch. That static modifier appears to be meaningless in that context.

> 2. Moving add_fragment_to_unexpected change the meaning of the code.

The location look wrong to me. A peruse receive event may be generated
multiple times the way it was before. Doesn't matter anymore though as
peruse is on its way out.

> 3. If this change get pushed in to the trunk, the only reason for the existence of last_probed disappear. Thus, the variable should disappear as well.

I agree. That variable should go away. I will remove it from my branch now.

> 4. The last part of the patch is not related to this topic and should be pushed separately.

Bah. That shouldn't have been there either. That is a separate issue I
can fix in another commit.

> Now the most major one. With this change you alter the most performance critical piece of code, by adding a non negligible number of potential cache misses (looking for the number of elements, adding/removing an element from a queue). This deserve a careful evaluation and consideration, not only for the less likely usage pattern you describe but for the more mainstream uses.

I agree that this should be reviewed carefully. A majority of the
changes are in the unexpected message path and not in the critical path
but due to the nature of icache misses it may still have an impact. I
verified there was no impact on one system using vader and a ping-pong
benchmark. I still need to verify there is no impact to message rate
both on and off node as well as verify there is no impact on other
architectures (AMD for example is very sensitive to changes outside the
critical path).

Thanks for your comments George!


  • application/pgp-signature attachment: stored