Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] problem in the ORTE notifier framework
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-06-08 07:39:58


I believe the concern here was that we aren't entirely sure just where
you plan to do this. If we are talking about reporting errors, then
there is less concern about adding cycles. For example, we already
check to see if the IB driver has exceeded the limit on retries -
adding more logic to the code that executes when that test is positive
is of little concern.

However, if we are talking about adding warnings that are not in the
error paths, then there is concern because that code will execute
every time, even when there isn't a problem. There is no issue with
using likely() directives, but I'm not sure there is general agreement
with your analysis regarding the potential impact of adding such code,
and the belief that it only adds one cycle doesn't appear to be
supported by our experience to date. Hence the cautions from other
developers.

Regardless, it has been our general policy to add this kind of
capability on a "configure-in" basis so that those who do not want it
are not impacted by it. My proposed method would allow for that
policy. Whether you use that approach, or devise your own, I do
believe the "configure-in" policy really needs to be used for this
capability.

Working on a tmp branch will give developers a chance to evaluate the
overall impact and help people decide whether or not to enable
this capability. I suspect (based on prior similar proposals) that
many will choose -not- to enable it (e.g., research clusters in
universities), while some (e.g., large production clusters) may well
do so, depending on exactly what you are reporting.

HTH
Ralph

On Jun 8, 2009, at 4:57 AM, Sylvain Jeaugey wrote:

> Ralph,
>
> Sorry for replying so late to this old thread; it seems that my
> answer was stuck in the "postponed" folder.
>
> About the if-then, I thought it cost 1 cycle. I mean, if you don't
> break the pipeline, i.e. you use likely() or __builtin_expect() or
> something similar so that the compiler lays out the assembly the
> right way, it shouldn't take more than 1 cycle, perhaps less on some
> architectures like Itanium [however, my multi-architecture view is
> limited to x86 and ia64, so I may be wrong].
>
> So, in these if-then cases where we know which branch is more
> likely to be taken, I don't think that 1 CPU cycle is really a
> problem, especially if we are already in a slow code path.
>
> Is there a multi-compiler, multi-arch, multi-OS reason not to use
> likely() directives?
>
> Sylvain
>
> On Wed, 27 May 2009, Ralph Castain wrote:
>
>> While that is a good way of minimizing the impact of the counter,
>> you still have to do an "if-then" to check if the counter
>> exceeds the threshold. This "if-then" also has to get executed
>> every time, and generally consumes more than a few cycles.
>>
>> To be clear: it isn't the output that is the concern. The output
>> only occurs as an exception case, essentially equivalent
>> to dealing with an error, so it can be "slow". The concern is with
>> the impact of testing to see if the output needs to be
>> generated as this testing occurs every time we transit the code.
>>
>> I think Jeff and I are probably closer to agreement on design than
>> it might seem, and may be close to what you might also
>> have had in mind. Basically, I was thinking of a macro like this:
>>
>> ORTE_NOTIFIER_VERBOSE(api, counter, threshold,...)
>> #if WANT_NOTIFIER_VERBOSE
>>     opal_atomic_increment(counter);
>>     if (counter > threshold) {
>>         orte_notifier.api(.....)
>>     }
>> #endif
>>
>> You would set the specific thresholds for each situation via MCA
>> params, so this could be tuned to fit specific needs.
>> Those who don't want the penalty can just build normally - those
>> who want this level of information can enable it.
>>
>> We can then see just how much penalty is involved in real world
>> situations. My guess is that it won't be that big, but it's
>> hard to know without seeing how frequently we actually insert this
>> code.
>>
>> Hope that makes sense
>> Ralph
>>
>> On Wed, May 27, 2009 at 1:25 AM, Sylvain Jeaugey <sylvain.jeaugey_at_[hidden]> wrote:
>> About performance, I may be missing something, but our first goal
>> was to track paths that are already slow.
>>
>> We imagined that it could be possible to add at the beginning
>> (or end) of this "bad path" just one line that would basically do
>> an atomic inc. So, in terms of CPU cycles, something like 1 for the
>> inc and maybe 1 jump before. Are a couple of cycles really an issue
>> in slow paths (which take at least hundreds of cycles), or do you
>> fear out-of-cache memory accesses, or something else?
>>
>> As for outputs, they are indeed slow (and can considerably slow
>> down an application if not synchronized), but aggregation on the
>> head node should solve our problems. And if not, we can also
>> disable outputs at runtime.
>>
>> So, in my opinion, no application should notice a difference
>> (unless you tune the framework to output every
>> warning).
>>
>> Sylvain
>> On Tue, 26 May 2009, Jeff Squyres wrote:
>>
>> Nadia --
>>
>> Sorry I didn't get to jump in on the other thread earlier.
>>
>> We have made considerable changes to the notifier framework in
>> a branch to better support "SOS"
>> functionality:
>>
>> https://www.open-mpi.org/hg/auth/hgwebdir.cgi/jsquyres/opal-sos
>>
>> Cisco and Indiana U. have been working on this branch for a
>> while. A description of the SOS stuff is
>> here:
>>
>> https://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages
>>
>> As for setting up an external web server with hg, don't bother
>> -- just get an account at bitbucket.org.
>> They're free and allow you to host hg repositories there.
>> I've used bitbucket to collaborate on code
>> before it hits OMPI's SVN trunk with both internal and
>> external OMPI developers.
>>
>> We can certainly move the opal-sos repo to bitbucket (or
>> branch again off opal-sos to bitbucket --
>> whatever makes more sense) to facilitate collaborating with you.
>>
>> Back on topic...
>>
>> I'd actually suggest a combination of what has been discussed
>> in the other thread. The notifier can be
>> the mechanism that actually sends the output message, but it
>> doesn't have to be the mechanism that tracks
>> the stats and decides when to output a message. That can be
>> separate logic, and therefore be more
>> fine-grained (and potentially even specific to the MPI layer).
>>
>> The Big Question will be how to do this with zero performance
>> impact when it is not being used. This has always been the
>> difficult issue when trying to implement any kind of monitoring
>> inside the core OMPI performance-sensitive paths. Even adding
>> individual branches has met with resistance (in
>> performance-critical code paths)...
>>
>> On May 26, 2009, at 10:59 AM, Nadia Derbey wrote:
>>
>> Hi,
>>
>> While having a look at the notifier framework under orte, I
>> noticed that the way it is written, the init routine for the
>> selected module cannot be called.
>>
>> Attached is a small patch that fixes this issue.
>>
>> Regards,
>> Nadia
>>
>> <orte_notifier_fix_select.patch><ATT14046023.txt>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel