MTT Devel Mailing List Archives

Subject: Re: [MTT devel] duplicate results
From: Eugene Loh (eugene.loh_at_[hidden])
Date: 2012-02-24 01:31:56


This is recycled e-mail from about 1.5 months ago. I observed this
problem again. That is, if one queries the MTT database, certain
results are reported twice.

The date range in question is 2012/02/23 from about 08:48 to 09:02. The
submitting system is once again ^burl-ct-v20z-2$. The problem is once
again with v1.5 testing and with intel-64 tests. On the client side,
the log seems to indicate that each result is submitted once. If I
query the database, however, it shows a number of results reported
twice. These incidents are consecutive -- that is, the behavior starts
at some time and ends at another.

Even if no one has time to figure this out, I figured I'd report this
for the record books.
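
For anyone who wants to repeat the client-side check, here is a minimal sketch of adding up the per-submit result counts in the MTT client output. The message wording is taken from the client output quoted further down in this thread; whether other MTT versions print exactly the same text is an assumption, and the function name `sum_reported_results` is made up for illustration.

```python
import re

# Pattern based on the client message quoted in this thread; the exact
# wording in other MTT versions is an assumption.
REPORT_RE = re.compile(
    r"Reported to MTTDatabase client: (\d+) successful submits?, "
    r"(\d+) failed submits? \(total of (\d+) results?\)"
)


def sum_reported_results(log_text):
    """Return (successful_submits, failed_submits, total_results)."""
    # Collapse whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    ok = failed = results = 0
    for match in REPORT_RE.finditer(flat):
        ok += int(match.group(1))
        failed += int(match.group(2))
        results += int(match.group(3))
    return ok, failed, results
```

If the summed total matches the expected number of tests while the database query shows more, that points away from the client having submitted the results twice.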

On 1/9/2012 11:13 AM, Josh Hursey wrote:
> Well if the debug results seem correct then there must be some bug in
> the submission script. :/ It is a pretty old piece of code, so it is
> possible that something is going awry in there.
>
> Let us know if you notice further problems like this. I won't have
> time to look into them in the near term, but I'll try to put it on the
> short list to get to when I get free cycles. If you happen to come
> across a repeater scenario (not likely since this seems like something
> difficult to reproduce) that would help the debugging effort.
>
> On Fri, Jan 6, 2012 at 2:07 PM, Eugene Loh <eugene.loh_at_[hidden]> wrote:
>
> On 01/06/12 08:52, Josh Hursey wrote:
>> Weird. I don't know what is going on here, unless the client is
>> somehow submitting some of the results too many times. One thing
>> to check is the debug output file that the MTT client is
>> submitting to the server. Check that for duplicates.
> Sorry, I don't understand where to check. I do know that if I
> look at the output from the MTT client, I see a bunch of messages
> like this:
>
> >> Reported to MTTDatabase client: 1 successful submit, 0 failed submits (total of 6 results)
>
> If I add up those numbers of results submitted, the totals match
> what I would expect. So, there is some indication that the number
> of client submissions is right.
>
>> That will help determine whether this is a server side problem or
>> client side problem. I have not noticed this behavior on the
>> server before,
> I haven't either, but I only just started looking more closely at
> results. Mostly, in any case, things look fine.
>
>> but might be something with the submit.php script - just a guess
>> though at this point.
>>
>> Unfortunately I have zero time to spend on MTT for a few weeks at
>> least. :/
>>
>> On Thu, Jan 5, 2012 at 8:11 PM, Eugene Loh <eugene.loh_at_[hidden]> wrote:
>>
>> I go to MTT and I choose:
>>
>> Test run
>> Date range: 2012-01-05 05:00:00 - 2012-01-05 12:00:00
>> Org: Oracle
>> Platform name: ^burl-ct-v20z-2$
>> Suite: intel-64
>>
>> and I get:
>>
>> 1 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-trunk 1.7a1r25692    intel-64 4 870 0 86 0 0
>> 2 oracle burl-ct-v20z-2 i86pc SunOS ompi-nightly-v1.5  1.5.5rc2r25683 intel-64 4 915 0 92 0 0
>>
>> I get more tests (passing and skipped) with v1.5 than I do
>> with the trunk run. I have lots of ways of judging what the
>> numbers should be. The "trunk" numbers are right. The "v1.5"
>> numbers are too high; they should be the same as the trunk
>> numbers.
>>
>> I can see the explanation by clicking on "Detail" and looking
>> at individual runs. (To get time stamps, I add a " | by
>> minute" qualifier before clicking on "Detail". Maybe there's
>> a more proper way of getting time stamps, but that seems to
>> work for me.) Starting with record 890 and ending with 991,
>> records are repeated. That is, 890 and 891 have identical
>> command lines, time stamps, output, etc. One of them is a
>> duplicate. Same with 892 and 893, then for 894 and 895, then
>> 896 and 897, and so on. So, for about a one-hour period, the
>> records sent in by this test run appear duplicated when I
>> submit queries to the database. These 51 duplicates are the
>> 45 extra passes and 6 extra skips seen in the results above.
>>
>> Can someone figure out what's going wrong here? Clearly, I'd
>> like to be able to rely on query results.
>>
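
The duplicate pattern described above (pairs of records with identical command lines, time stamps, and output) could be checked mechanically along these lines. This is only a sketch: it assumes the "Detail" rows have already been exported into `(command, timestamp, result)` tuples, which is a hypothetical format and not something the MTT reporter is known to produce, and `summarize_duplicates` is an illustrative name.

```python
from collections import Counter


def summarize_duplicates(records):
    """Summarize surplus copies in (command, timestamp, result) tuples.

    Returns (extras_by_result, first_ts, last_ts): the number of surplus
    copies per result value, and the earliest and latest time stamps among
    duplicated records (None, None if nothing is duplicated). Time stamps
    are assumed to sort correctly as given, e.g. "2012-01-05 09:00".
    """
    counts = Counter(records)
    extras = Counter()
    stamps = []
    for (command, timestamp, result), n in counts.items():
        if n > 1:
            extras[result] += n - 1  # copies beyond the first
            stamps.append(timestamp)
    if not stamps:
        return dict(extras), None, None
    return dict(extras), min(stamps), max(stamps)
```

With the numbers in this thread, the surplus should come out as 915 - 870 = 45 passes and 92 - 86 = 6 skips, which matches the 51 duplicated pairs in records 890 through 991.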