MTT Devel Mailing List Archives

From: Josh Hursey (jjhursey_at_[hidden])
Date: 2007-08-31 12:47:26


I was looking at the data from Monday Aug 27, 8 am to Tuesday Aug 28,
around noon, when this problem was occurring, and the data is mostly
invalid. We have test_builds pointing at the wrong test_suites. Since
this brings all of this data into suspicion, I'm going through and
flagging them all as 'trial'.

If you don't have any conflict, then I'd like to remove this data
altogether from the database so the normalization tables can be
cleaned up.

Any objections to removing the set of data in the time range Monday
Aug 27, 8 am to Tuesday Aug 28, around noon? It's about 8,000
test_runs. Since most of the test runs were getting rejected during
that time period, we are not losing any good data.
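
For illustration, flagging a time window as 'trial' could look roughly
like the sketch below. It is not the actual MTT schema or script: the
table layout, the column names (test_run_id, start_timestamp, trial),
the helper flag_window_as_trial, and the use of an in-memory SQLite
database are all assumptions made for the example, and "around noon" is
approximated as 12:00.

```python
import sqlite3

# Hypothetical schema: the real MTT tables and column names may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_run ("
    " test_run_id INTEGER PRIMARY KEY,"
    " start_timestamp TEXT NOT NULL,"
    " trial INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO test_run (test_run_id, start_timestamp) VALUES (?, ?)",
    [
        (1, "2007-08-27 07:59:00"),  # before the window
        (2, "2007-08-27 08:00:00"),  # window start
        (3, "2007-08-28 11:30:00"),  # inside the window
        (4, "2007-08-28 12:00:01"),  # after the window
    ],
)

WINDOW_START = "2007-08-27 08:00:00"
WINDOW_END = "2007-08-28 12:00:00"  # assumed cutoff for "around noon"


def flag_window_as_trial(conn, start, end):
    """Mark every test_run whose timestamp falls in [start, end] as trial.

    ISO-formatted timestamps compare correctly as strings, so BETWEEN works.
    Returns the number of rows that were flagged.
    """
    cur = conn.execute(
        "UPDATE test_run SET trial = 1 WHERE start_timestamp BETWEEN ? AND ?",
        (start, end),
    )
    conn.commit()
    return cur.rowcount


flagged = flag_window_as_trial(conn, WINDOW_START, WINDOW_END)
trial_ids = [
    row[0]
    for row in conn.execute(
        "SELECT test_run_id FROM test_run WHERE trial = 1 ORDER BY test_run_id"
    )
]
```

Flagging first and deleting later keeps the suspect rows out of the
reports while leaving them available for inspection until everyone has
agreed to the removal.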

-- Josh

On Aug 28, 2007, at 10:27 AM, Josh Hursey wrote:

> Short Version:
> --------------
> I just finished the fix, and the submit script is back up and running.
>
> This was a bug that arose in testing, but somehow did not get
> propagated to the production database.
>
> Long Version:
> -------------
> The new database uses partition tables to archive test results. As
> part of this there are some complex rules to mask the partition table
> complexity from the users of the db. There was a bug in the insert
> rule in which the 'id' of the submitted result (mpi_install,
> test_build, and test_run) was a different value than expected since
> the 'id' was not translated properly to the partition table setup.
>
> The fix was to drop all rules and replace them with the correct
> versions. The submit errors you saw below were caused by integrity
> checks in the submit script that keep data from being submitted if it
> does not have a proper lineage (e.g., you cannot submit a test_run
> without having submitted a test_build and an mpi_install result). The
> bug caused the client and the server to disagree about what the
> proper 'id' should be, and when the submit script attempted to 'guess'
> the correct run, it was unsuccessful and errored out.
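
The lineage check described above can be sketched as follows. This is
an illustration only, not the real submit script: the ResultStore class,
its method names, and the in-memory bookkeeping are hypothetical, and
only the rule itself (a test_run needs a known test_build, which in turn
needs a known mpi_install) comes from the description.

```python
class LineageError(Exception):
    """Raised when a result is submitted without a known parent result."""


class ResultStore:
    """Hypothetical in-memory stand-in for the server-side integrity checks."""

    def __init__(self):
        self.mpi_installs = set()
        self.test_builds = {}  # test_build_id -> mpi_install_id
        self.test_runs = {}    # test_run_id -> test_build_id

    def submit_mpi_install(self, mpi_install_id):
        # An mpi_install is the root of the lineage and has no parent.
        self.mpi_installs.add(mpi_install_id)

    def submit_test_build(self, test_build_id, mpi_install_id):
        # A test_build must point at an mpi_install the store already knows.
        if mpi_install_id not in self.mpi_installs:
            raise LineageError("No mpi_install associated with this test_build")
        self.test_builds[test_build_id] = mpi_install_id

    def submit_test_run(self, test_run_id, test_build_id):
        # A test_run must point at a test_build the store already knows.
        if test_build_id not in self.test_builds:
            raise LineageError("No test_build associated with this test_run")
        self.test_runs[test_run_id] = test_build_id


store = ResultStore()
store.submit_mpi_install(1)
store.submit_test_build(10, mpi_install_id=1)
store.submit_test_run(100, test_build_id=10)  # accepted: full lineage exists

try:
    # The client and server disagree on the id, so the parent is unknown.
    store.submit_test_run(101, test_build_id=47368)
    rejected = False
except LineageError:
    rejected = True
```

With a check of this kind, an 'id' that the client and the server do not
agree on is rejected at submit time rather than being attached to an
unrelated parent.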
>
> So sorry this bug lived this long, but it should be fixed now.
>
> -- Josh
>
> On Aug 28, 2007, at 10:16 AM, Jeff Squyres wrote:
>
>> Josh found the problem and is in the process of fixing it. DB
>> submits are currently disabled while Josh is working on the fix.
>> More specific details coming soon.
>>
>> Unfortunately, it looks like all data from last night will be
>> junk. :-( You might as well kill any MTT scripts that are still
>> running from last night.
>>
>>
>> On Aug 28, 2007, at 9:14 AM, Jeff Squyres wrote:
>>
>>> Josh and I are investigating -- the total runs in the db in the
>>> summary report from this morning is far too low. :-(
>>>
>>>
>>> On Aug 28, 2007, at 9:13 AM, Tim Prins wrote:
>>>
>>>> It installed and the tests built and made it into the database:
>>>> http://www.open-mpi.org/mtt/reporter.php?do_redir=293
>>>>
>>>> Tim
>>>>
>>>> Jeff Squyres wrote:
>>>>> Did you get a correct MPI install section for mpich2?
>>>>>
>>>>> On Aug 28, 2007, at 9:05 AM, Tim Prins wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am working with the jms branch, and when trying to use mpich2,
>>>>>> I get
>>>>>> the following submit error:
>>>>>>
>>>>>> *** WARNING: MTTDatabase server notice: mpi_install_section_name is not in mtt database.
>>>>>> MTTDatabase server notice: number_of_results is not in mtt database.
>>>>>> MTTDatabase server notice: phase is not in mtt database.
>>>>>> MTTDatabase server notice: test_type is not in mtt database.
>>>>>> MTTDatabase server notice: test_build_section_name is not in mtt database.
>>>>>> MTTDatabase server notice: variant is not in mtt database.
>>>>>> MTTDatabase server notice: command is not in mtt database.
>>>>>> MTTDatabase server notice: fields is not in mtt database.
>>>>>> MTTDatabase server notice: resource_manager is not in mtt database.
>>>>>>
>>>>>> MTT submission for test run
>>>>>> MTTDatabase server notice: Invalid test_build_id (47368) given. Guessing that it should be -1
>>>>>> MTTDatabase server error: ERROR: Unable to find a test_build to associate with this test_run.
>>>>>>
>>>>>> MTTDatabase abort: (Tried to send HTTP error) 400
>>>>>> MTTDatabase abort:
>>>>>> No test_build associated with this test_run
>>>>>> *** WARNING: MTTDatabase did not get a serial; phases will be isolated from each other in the reports
>>>>>> >> Reported to MTTDatabase: 1 successful submit, 0 failed submits (total of 12 results)
>>>>>>
>>>>>> This happens for each test run section.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Tim
>>>>>> _______________________________________________
>>>>>> mtt-users mailing list
>>>>>> mtt-users_at_[hidden]
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>>> --
>>> Jeff Squyres
>>> Cisco Systems