Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] problem in the ORTE notifier framework
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-05-27 20:14:43


I think it depends upon what is being monitored. As I understand it, we
could use the peruse link to generate notifications based on the number of
times someone calls "MPI_Send", for example. I concur with George's concerns
about performance in this area and would agree that using the peruse hooks
makes some sense.

However, if one wants to generate a notification when an error occurs (e.g.,
too many IB retries) that might not be fatal, but only wants that
notification to go out every xx times that happens, then I don't think the
peruse option will work. In this scenario, though, I don't think performance
is an issue any longer - this code path would only be followed when tracking
errors, and thus can run more slowly.
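
As a rough illustration of the "every xx times" idea, here is a minimal
sketch in plain C11. The error_throttle_t type and error_throttle_hit()
function are hypothetical names made up for this example; they are not part
of ORTE or OPAL, and the real code would presumably use the OPAL atomics
rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-error state; not an ORTE/OPAL type. */
typedef struct {
    atomic_uint count;    /* occurrences recorded so far */
    unsigned    interval; /* notify once every `interval` occurrences */
} error_throttle_t;

/* Record one occurrence of the error. Returns true when this occurrence
 * is the one that should trigger a notification. */
static bool error_throttle_hit(error_throttle_t *t)
{
    /* atomic_fetch_add returns the previous value, so add 1 to get the
     * count that includes this occurrence. */
    unsigned n = atomic_fetch_add(&t->count, 1u) + 1u;

    /* An interval of 0 is treated as "never notify". */
    return t->interval != 0 && (n % t->interval) == 0;
}
```

The caller would sit in the error path only (e.g., where an IB retry is
detected) and invoke the notifier when the function returns true, so the
cost of the atomic is never paid on the normal send path.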

So I think a combination of the two approaches makes the most sense. All the
ORTE_NOTIFIER_VERBOSE method does is provide a means of enabling the second
option in a configure-it-in/out way that is fairly clean as it just mirrors
the current OPAL_OUTPUT_VERBOSE methodology. Using peruse for the first
option sounds like a reasonable approach.
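
To make the configure-it-in/out part concrete, here is a hedged sketch of
how such a macro might look, written in plain C11 so it stands alone. The
names NOTIFIER_VERBOSE, demo_notify and notifications_sent are assumptions
for this example only; the real macro would call into orte_notifier and use
opal_atomic_* rather than <stdatomic.h>, and WANT_NOTIFIER_VERBOSE would be
set by configure rather than defaulted here.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Assumption: configure would normally define this to 0 or 1. */
#ifndef WANT_NOTIFIER_VERBOSE
#define WANT_NOTIFIER_VERBOSE 1
#endif

/* Hypothetical stand-in for an orte_notifier API entry point. */
static int notifications_sent = 0;

static void demo_notify(const char *msg)
{
    notifications_sent++;
    fprintf(stderr, "notify: %s\n", msg);
}

#if WANT_NOTIFIER_VERBOSE
/* Increment the counter and notify once it has passed the threshold.
 * The result of the atomic add is tested directly, so the check does
 * not race with other threads incrementing the same counter. */
#define NOTIFIER_VERBOSE(notify_fn, counter, threshold, msg)         \
    do {                                                             \
        if (atomic_fetch_add(&(counter), 1) + 1 > (threshold)) {     \
            notify_fn(msg);                                          \
        }                                                            \
    } while (0)
#else
/* Configured out: the macro compiles to nothing. */
#define NOTIFIER_VERBOSE(notify_fn, counter, threshold, msg) \
    do { } while (0)
#endif
```

Note that, like the pseudocode quoted further down, this fires on every call
once the threshold has been passed; resetting the counter or notifying only
every N-th time would be a small change inside the macro.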

HTH
Ralph

On Wed, May 27, 2009 at 12:25 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:

> Excellent points; Ralph and I chatted about this on the phone today -- we
> concur with George.
>
> Bull -- would peruse work for you? I think you mentioned before that it
> didn't seem attractive to you. I think George's point is that we already
> have lots of hooks in place in the PML -- and they're called peruse. So if
> we could use those hooks, then a) they're run-time selectable already, and
> b) there's no additional cost in performance critical/not-critical code
> paths (for the case where these stats are not being collected) because
> PERUSE has been in the code base for a long time.
>
> I think the idea is that your callbacks could be invoked by the peruse
> hooks and then they can do whatever they want -- increment counters,
> conditionally invoke the ORTE notifier system, etc.
>
>
>
>
> On May 27, 2009, at 11:34 AM, George Bosilca wrote:
>
>> What is a generic threshold? And what is a counter? We have a policy
>> against such coding standards, and to be honest I would like to stick
>> to it. The reason is that the PML is a very complex piece of code, and
>> I would like to keep it as easy to understand as possible. If people
>> start adding #if/#endif all over the code, we are diverging from this goal.
>>
>> The only way to make this work is to call the notifier or some other
>> framework in this "slow path" and let this other framework do its own
>> logic to determine what and when to print. Of course the cost of this
>> is a function call plus an atomic operation (which is already not
>> cheap). It's starting to get expensive, even for a "slow path", which
>> in this particular context is just one insertion in an atomic FIFO.
>>
>> If, instead of counting the number of times we try to send the fragment,
>> we switch to a time-based approach, this can be solved with the PERUSE
>> calls. There is a callback when the request is created, and another
>> callback when the first fragment is pushed successfully into the
>> network. Computing the time between these two allows a tool to figure
>> out how long the request was waiting in some internal queues, and
>> therefore how much delay this added to the execution time.
>>
>> george.
>>
>> On May 27, 2009, at 06:59 , Ralph Castain wrote:
>>
>> > ORTE_NOTIFIER_VERBOSE(api, counter, threshold,...)
>> >
>> > #if WANT_NOTIFIER_VERBOSE
>> > opal_atomic_increment(counter);
>> > if (counter > threshold) {
>> > orte_notifier.api(.....)
>> > }
>> > #endif
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
> --
> Jeff Squyres
> Cisco Systems
>