Subject: Re: [MTT users] Can not find my testing results in OMPI MTT DB
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2008-07-08 17:05:08


If you are running on the trunk, there's this Test run INI
parameter:

  submit_results_after_each = 1

That tells MTT to submit results after each new test
executable. This might add significant overhead, but at
least you'd have some results in the DB to look at. I'll go
ahead and add "submit_results_after_n_tests". (Is that an
okay name?) E.g., this would submit results after every 100
new test executables.

  submit_results_after_n_tests = 100

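For reference, here's roughly where these would sit in an INI file
(the section name and placement are illustrative, so check them
against your own config):

  [Test run: intel-suite]
  ...
  # Submit after each new test executable
  submit_results_after_each = 1
  # Or, instead (once it exists), after every 100 new test executables
  submit_results_after_n_tests = 100
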
-Ethan

On Tue, Jul/08/2008 04:11:46PM, Jeff Squyres wrote:
> ...and/or we could finally make the client capable of breaking up large
> sets of submit data into multiple submits to the server. :-)
>
> I unfortunately have no cycles to work on this, but it shouldn't be *too*
> hard to do... E.g., the client can loop over submitting N results at a
> time until all results have been submitted.
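>
> As a stopgap, you could approximate this by hand: if you have the
> MTTDatabase debug files around (like the mttdb_debug_file.*.txt files
> mentioned below), loop mtt-submit over them one at a time. A sketch,
> using the same flags as the transcript below:
>
>   $ for f in mttdb_debug_file.*.txt; do client/mtt-submit -h www.open-mpi.org -f "$f" -z -u sun -p sun4sun; done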
>
>
> On Jul 8, 2008, at 4:07 PM, Ethan Mallove wrote:
>
>> I see a bunch of these errors in the httpd logs today:
>>
>> PHP Fatal error: Allowed memory size of 33554432 bytes exhausted
>>
>> Could you send your INI file? One way around this issue is
>> to submit fewer test results at one time by breaking up your
>> large MTT run into multiple runs. If you're running on the
>> trunk, the submit_results_after_each parameter (above) will
>> submit results after each run.
>>
>> -Ethan
>>
>>
>> On Tue, Jul/08/2008 10:22:47AM, Pavel Shamis (Pasha) wrote:
>>> Hi All,
>>> I still see that sometimes MTT (http://www.open-mpi.org/mtt) "loses" my
>>> test results. In the local log I see:
>>>
>>> ### Test progress: 474 of 474 section tests complete (100%)
>>> Submitting to MTTDatabase...
>>> MTTDatabase client trying proxy: / Default (none)
>>> MTTDatabase proxy successful / not 500
>>> MTTDatabase response is a success
>>> MTTDatabase client got response:
>>> *** WARNING: MTTDatabase client did not get a serial; phases will be
>>> isolated from each other in the reports
>>> MTTDatabase client submit complete
>>>
>>> And I cannot find these results in the DB.
>>> Is there any progress on this issue?
>>>
>>>
>>> Regards.
>>> Pasha
>>>
>>> Ethan Mallove wrote:
>>>> On Wed, May/21/2008 09:53:11PM, Pavel Shamis (Pasha) wrote:
>>>>
>>>>> Oops, in the "MTT server side problem" thread we discussed a different issue.
>>>>>
>>>>> But anyway, I have not seen the problem on my server since the upgrade :)
>>>>>
>>>>
>>>>
>>>> We took *some* steps to alleviate the PHP memory overload
>>>> problem (e.g., r668, and then r1119), but evidently there's
>>>> more work to do :-)
>>>>
>>>>
>>>>> Pasha
>>>>>
>>>>> Pavel Shamis (Pasha) wrote:
>>>>>
>>>>>> I had a similar problem on my server. I upgraded the server to the
>>>>>> latest trunk and the problem disappeared.
>>>>>> (see "MTT server side problem" thread).
>>>>>>
>>>>>> Pasha
>>>>>>
>>>>>> Jeff Squyres wrote:
>>>>>>
>>>>>>> FWIW: I think we have at least one open ticket on this issue
>>>>>>> (break up submits so that we don't overflow PHP and/or apache).
>>>>>>>
>>>>
>>>> https://svn.open-mpi.org/trac/mtt/ticket/221
>>>>
>>>> -Ethan
>>>>
>>>>
>>>>>>> On May 21, 2008, at 2:36 PM, Ethan Mallove wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Wed, May/21/2008 06:46:06PM, Pavel Shamis (Pasha) wrote:
>>>>>>>>
>>>>>>>>> I sent it directly to your email. Please check.
>>>>>>>>> Thanks,
>>>>>>>>> Pasha
>>>>>>>>>
>>>>>>>> Got it. Thanks. It's a PHP memory overload issue.
>>>>>>>> (Apparently I didn't look back far enough in the httpd
>>>>>>>> error_logs.) See below.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Ethan Mallove wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, May/21/2008 05:19:44PM, Pavel Shamis (Pasha) wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Jeff Squyres wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Are we running into http max memory problems or http max
>>>>>>>>>>>> upload size problems again?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> I guess it is some server-side issue; you need to check the
>>>>>>>>>>> /var/log/httpd/* logs on the server.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> The only thing I found in the httpd logs
>>>>>>>>>> (/var/log/httpd/www.open-mpi.org/error_log*) was this PHP
>>>>>>>>>> warning, which I don't think would result in lost results:
>>>>>>>>>>
>>>>>>>>>> PHP Warning: array_shift(): The argument should be an array in
>>>>>>>>>> .../submit/index.php on line 1683
>>>>>>>>>>
>>>>>>>>>> I haven't received any emailed Postgres errors either. When
>>>>>>>>>> were these results submitted? I searched for "mellanox" over
>>>>>>>>>> the past four days. It seems the results aren't buried in
>>>>>>>>>> here, because I see no test run failures ...
>>>>>>>>>>
>>>>>>>>>> http://www.open-mpi.org/mtt/index.php?do_redir=659
>>>>>>>>>>
>>>>>>>>>> I'm assuming you're running with two Reporter INI sections:
>>>>>>>>>> Textfile and MTTDatabase? Can you send some MTT client
>>>>>>>>>> --verbose/--debug output from the below runs?
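>>>>>>>>>>
>>>>>>>>>> I.e., something along these lines (section titles and module
>>>>>>>>>> names from memory, so double-check them):
>>>>>>>>>>
>>>>>>>>>> [Reporter: text file backup]
>>>>>>>>>> module = TextFile
>>>>>>>>>>
>>>>>>>>>> [Reporter: IU database]
>>>>>>>>>> module = MTTDatabase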
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Ethan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> On May 21, 2008, at 5:28 AM, Pavel Shamis (Pasha) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here are the test results from my last MTT run:
>>>>>>>>>>>>> +-------------+----------------+------+------+----------+------+
>>>>>>>>>>>>> | Phase       | Section        | Pass | Fail | Time out | Skip |
>>>>>>>>>>>>> +-------------+----------------+------+------+----------+------+
>>>>>>>>>>>>> | MPI install | ompi/gcc       |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | MPI install | ompi/intel-9.0 |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | trivial        |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | trivial        |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | intel-suite    |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | intel-suite    |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | imb            |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | imb            |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | presta         |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | presta         |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | osu_benchmarks |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | osu_benchmarks |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | netpipe        |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Build  | netpipe        |    1 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Run    | trivial        |   64 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Run    | trivial        |   64 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Run    | intel-suite    | 3179 |  165 |      400 |    0 |
>>>>>>>>>>>>> | Test Run    | intel-suite    |  492 |    0 |        0 |    0 |
>>>>>>>>>>>>> +-------------+----------------+------+------+----------+------+
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the OMPI MTT DB (http://www.open-mpi.org/mtt) I found the
>>>>>>>>>>>>> following "Test Run" results:
>>>>>>>>>>>>>
>>>>>>>>>>>>> | Test Run    | trivial        |   64 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Run    | trivial        |   64 |    0 |        0 |    0 |
>>>>>>>>>>>>> | Test Run    | intel-suite    |  492 |    0 |        0 |    0 |
>>>>>>>>>>>>>
>>>>>>>>>>>>> And I cannot find this one:
>>>>>>>>>>>>>
>>>>>>>>>>>>> | Test Run    | intel-suite    | 3179 |  165 |      400 |    0 |
>>>>>>>>>>>>>
>>>>>>>> Some missing results are in mttdb_debug_file.16.txt (and
>>>>>>>> 17.txt), which are the largest .txt files of the bunch. 8
>>>>>>>> variants isn't that many, but maybe it causes a problem when
>>>>>>>> there's lots of stderr/stdout? I'm surprised
>>>>>>>> submit/index.php barfs on files this size:
>>>>>>>>
>>>>>>>> $ ls -l
>>>>>>>> ...
>>>>>>>> -rw-r--r-- 1 em162155 staff 956567 May 21 14:21
>>>>>>>> mttdb_debug_file.16.inc.gz
>>>>>>>> -rw-r--r-- 1 em162155 staff 9603132 May 21 14:09
>>>>>>>> mttdb_debug_file.16.txt
>>>>>>>> ...
>>>>>>>>
>>>>>>>> $ client/mtt-submit -h www.open-mpi.org -f mttdb_debug_file.16.txt -z -u sun -p sun4sun -d
>>>>>>>> LWP::UserAgent::new: ()
>>>>>>>> LWP::UserAgent::proxy: http
>>>>>>>>
>>>>>>>> Filelist: $VAR1 = 'mttdb_debug_file.16.txt';
>>>>>>>> LWP::MediaTypes::read_media_types: Reading media types from
>>>>>>>> /ws/ompi-tools/lib/perl5/5.8.8/LWP/media.types
>>>>>>>> LWP::MediaTypes::read_media_types: Reading media types from
>>>>>>>> /usr/perl5/site_perl/5.8.4/LWP/media.types
>>>>>>>> LWP::MediaTypes::read_media_types: Reading media types from
>>>>>>>> /home/em162155/.mime.types
>>>>>>>> LWP::UserAgent::request: ()
>>>>>>>> LWP::UserAgent::send_request: POST
>>>>>>>> http://www.open-mpi.org/mtt/submit/index.php
>>>>>>>> LWP::UserAgent::_need_proxy: Not proxied
>>>>>>>> LWP::Protocol::http::request: ()
>>>>>>>> LWP::UserAgent::request: Simple response: OK
>>>>>>>>
>>>>>>>> $ tail -f /var/log/httpd/www.open-mpi.org/error_log | grep -w submit
>>>>>>>> ...
>>>>>>>> [client 192.18.128.5] PHP Fatal error: Allowed memory size of
>>>>>>>> 33554432 bytes exhausted (tried to allocate 14 bytes) in
>>>>>>>> /nfs/rontok/xraid/data/osl/www/www.open-mpi.org/mtt/submit/index.php
>>>>>>>> on line 1559
>>>>>>>> ...
>>>>>>>>
>>>>>>>> We'll have to somehow make these loops more efficient.
>>>>>>>> E.g., line 1559:
>>>>>>>>
>>>>>>>> foreach (array_keys($_POST) as $k) {
>>>>>>>>
>>>>>>>> Maybe if we broke $_POST up into multiple parts (e.g.,
>>>>>>>> $_POST_1, $_POST_2, ... $_POST_N)? Maybe we could do
>>>>>>>> something more efficient than array_keys here? I'm not sure.
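>>>>>>>>
>>>>>>>> For instance, a rough sketch (untested, and handle_field() is
>>>>>>>> a made-up stand-in for whatever the real loop body does):
>>>>>>>>
>>>>>>>> // Walking $_POST directly avoids allocating the separate
>>>>>>>> // key array that array_keys($_POST) builds up front.
>>>>>>>> foreach ($_POST as $k => $v) {
>>>>>>>>     handle_field($k, $v);
>>>>>>>> }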
>>>>>>>>
>>>>>>>> The only workaround on the client side would be to break up
>>>>>>>> the runs. Maybe do a single MPI Install at a time? Do
>>>>>>>> ompi/gcc then ompi/intel-9.0 as separate invocations of the
>>>>>>>> MTT client.
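>>>>>>>>
>>>>>>>> E.g., something like this, where each INI file carries just one
>>>>>>>> MPI install section (assuming the client's usual --file option;
>>>>>>>> the file names here are made up):
>>>>>>>>
>>>>>>>> $ client/mtt --file ompi-gcc.ini
>>>>>>>> $ client/mtt --file ompi-intel-9.0.ini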
>>>>>>>>
>>>>>>>> Sorry :-(
>>>>>>>>
>>>>>>>> -Ethan
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>> From the log I see that all test results were submitted
>>>>>>>>>>>>> successfully.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you please check?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pasha
>
>
> --
> Jeff Squyres
> Cisco Systems
>