Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] RFC: optimize probe in ob1
From: George Bosilca (bosilca_at_[hidden])
Date: 2014-02-10 19:43:37

On Feb 11, 2014, at 01:05 , Nathan Hjelm <hjelmn_at_[hidden]> wrote:

> On Tue, Feb 11, 2014 at 12:29:57AM +0100, George Bosilca wrote:
>> Nathan,
>> While this sounds like an optimization for highly specific application behavior, it is justifiable under some usage scenarios. I have several issues with the patch. Here are the minor ones:
>> 1. It does modifications that are nor necessary to the patch itself (as an example removal of the static keyword from the mca_pml_ob1_comm_proc_t class instance).
> Yeah. Not really part of the RFC. I should have removed it from the
> patch. That static modifier appears to be meaningless in that context.

The class is only usable in the context of a single .c file. As a code protection it makes perfect sense to me.

>> 2. Moving add_fragment_to_unexpected change the meaning of the code.
> The location look wrong to me. A peruse receive event may be generated
> multiple times the way it was before. Doesn't matter anymore though as
> peruse is on its way out.

It’s not yet, and I did not notice an RFC about. The event I was referring to is only generated when the message is first noticed. In the particular instance affected by your patch it has been delayed until the communicator is created locally, but it still have to be generated once.

>> 3. If this change get pushed in to the trunk, the only reason for the existence of last_probed disappear. Thus, the variable should disappear as well.
> I agree. That variable should go away. I will remove it from my branch now.
>> 4. The last part of the patch is not related to this topic and should be pushed separately.
> Bah. That shouldn't have been there either. That is a separate issue I
> can fix in another commit.
>> Now the most major one. With this change you alter the most performance critical piece of code, by adding a non negligible number of potential cache misses (looking for the number of elements, adding/removing an element from a queue). This deserve a careful evaluation and consideration, not only for the less likely usage pattern you describe but for the more mainstream uses.
> I agree that this should be reviewed carefully. A majority of the
> changes are in the unexpected message path and not in the critical path
> but due to the nature of cache misses it may still have an impact.

The size check and the removal from the list is still in the critical path. At some point we were down to few hundreds of nano-sec, enough to get bugged by one extra memory reference.


> I verified there was no impact on one system using vader and a ping-pong
> benchmark. I still need to verify there is no impact to message rate
> both on and off node as well as verify there is no impact on other
> architectures (AMD for example is very sensitive to changes outside the
> critical path).
> Thanks for your comments George!
> -Nathan
> _______________________________________________
> devel mailing list
> devel_at_[hidden]