Subject: Re: [MTT users] Can not find my testing results in OMPI MTT DB
From: Pavel Shamis (Pasha) (pasha_at_[hidden])
Date: 2008-07-09 03:55:42


Ethan Mallove wrote:
> If you are running on the trunk, there's this Test run INI
> parameter:
>
> submit_results_after_each = 1
>
> That tells MTT to submit results after each new test
> executable. This might add significant overhead, but at
> least you'd have some results in the DB to look at. I'll go
> ahead and add "submit_results_after_n_tests". (Is that an
> okay name?) E.g., this would submit results after every 100
> new test executables.
>
> submit_results_after_n_tests = 100
>

The "submit_results_after_n_tests" solution sounds good for me.
In which section I should define it ?
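For example, would it go into each Test run section, something like this?
(Just my guess at the placement; the section name is one from my own INI.)

    [Test run: intel-suite]
    ...
    # submit to the DB after each new test executable (trunk only)
    submit_results_after_each = 1
    # or, once it is available, submit after every N test executables
    submit_results_after_n_tests = 100
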
Pasha

> On Tue, Jul/08/2008 04:11:46PM, Jeff Squyres wrote:
>
>> ...and/or we could finally make the client capable of breaking up large
>> sets of submit data into multiple submits to the server. :-)
>>
>> I unfortunately have no cycles to work on this, but it shouldn't be *too*
>> hard to do... E.g., the client can loop over submitting N results at a
>> time until all results have been submitted.
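
Just to make sure I follow the idea -- something along these lines on the
client side? This is only a rough sketch, not the real MTT code; the form
field names and the data are made up for illustration, only the submit URL
is the real one:

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $batch_size = 100;    # e.g. the proposed submit_results_after_n_tests
    my $url        = 'http://www.open-mpi.org/mtt/submit/index.php';
    my $ua         = LWP::UserAgent->new();

    # One hashref of form fields per test result (fake data for the sketch).
    my @results =
        map { { test_name => "test_$_", test_result => 'pass' } } 1 .. 250;

    while (my @batch = splice(@results, 0, $batch_size)) {
        # Turn this batch into one set of numbered form fields and POST it
        # as its own submission, so no single submit is too large for the
        # server-side PHP memory limit.
        my %form = ( result_count => scalar(@batch) );
        my $i = 1;
        for my $r (@batch) {
            $form{"test_name_$i"}   = $r->{test_name};
            $form{"test_result_$i"} = $r->{test_result};
            $i++;
        }
        my $resp = $ua->post($url, \%form);
        warn 'batch submit failed: ' . $resp->status_line . "\n"
            unless $resp->is_success;
    }
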
>>
>>
>> On Jul 8, 2008, at 4:07 PM, Ethan Mallove wrote:
>>
>>
>>> I see a bunch of these errors in the httpd logs today:
>>>
>>> PHP Fatal error: Allowed memory size of 33554432
>>>
>>> Could you send your INI file? One way around this issue is
>>> to submit fewer test results at one time by breaking up your
>>> large MTT run into multiple runs. If you're running on the
>>> trunk, this parameter will submit results after each run.
>>>
>>> -Ethan
>>>
>>>
>>> On Tue, Jul/08/2008 10:22:47AM, Pavel Shamis (Pasha) wrote:
>>>
>>>> Hi All,
>>>> I still see that sometimes MTT (http://www.open-mpi.org/mtt) "loses" my
>>>> test results. In the local log I see:
>>>>
>>>> ### Test progress: 474 of 474 section tests complete (100%)
>>>> Submitting to MTTDatabase...
>>>> MTTDatabase client trying proxy: / Default (none)
>>>> MTTDatabase proxy successful / not 500
>>>> MTTDatabase response is a success
>>>> MTTDatabase client got response:
>>>> *** WARNING: MTTDatabase client did not get a serial; phases will be
>>>> isolated from each other in the reports
>>>> MTTDatabase client submit complete
>>>>
>>>> And I cannot find these results in the DB.
>>>> Is there any progress with this issue?
>>>>
>>>>
>>>> Regards.
>>>> Pasha
>>>>
>>>> Ethan Mallove wrote:
>>>>
>>>>> On Wed, May/21/2008 09:53:11PM, Pavel Shamis (Pasha) wrote:
>>>>>
>>>>>
>>>>>> Oops, in the "MTT server side problem" thread we discussed a different issue.
>>>>>>
>>>>>> But anyway, I have not seen the problem on my server since the upgrade :)
>>>>>>
>>>>>>
>>>>> We took *some* steps to alleviate the PHP memory overload
>>>>> problem (e.g., r668, and then r1119), but evidently there's
>>>>> more work to do :-)
>>>>>
>>>>>
>>>>>
>>>>>> Pasha
>>>>>>
>>>>>> Pavel Shamis (Pasha) wrote:
>>>>>>
>>>>>>
>>>>>>> I had a similar problem on my server. I upgraded the server to the
>>>>>>> latest trunk and the problem disappeared.
>>>>>>> (See the "MTT server side problem" thread.)
>>>>>>>
>>>>>>> Pasha
>>>>>>>
>>>>>>> Jeff Squyres wrote:
>>>>>>>
>>>>>>>
>>>>>>>> FWIW: I think we have at least one open ticket on this issue (break up
>>>>>>>> submits so that we don't overflow PHP and/or Apache).
>>>>>>>>
>>>>>>>>
>>>>> https://svn.open-mpi.org/trac/mtt/ticket/221
>>>>>
>>>>> -Ethan
>>>>>
>>>>>
>>>>>
>>>>>>>> On May 21, 2008, at 2:36 PM, Ethan Mallove wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Wed, May/21/2008 06:46:06PM, Pavel Shamis (Pasha) wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I sent it directly to your email. Please check.
>>>>>>>>>> Thanks,
>>>>>>>>>> Pasha
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Got it. Thanks. It's a PHP memory overload issue.
>>>>>>>>> (Apparently I didn't look far back enough in the httpd
>>>>>>>>> error_logs.) See below.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Ethan Mallove wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Wed, May/21/2008 05:19:44PM, Pavel Shamis (Pasha) wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Jeff Squyres wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Are we running into HTTP max memory problems or HTTP max upload
>>>>>>>>>>>>> size problems again?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> I guess it is some server-side issue; you need to check the
>>>>>>>>>>>> /var/log/httpd/* logs on the server.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> The only thing I found in the httpd logs
>>>>>>>>>>> (/var/log/httpd/www.open-mpi.org/error_log*) was this PHP
>>>>>>>>>>> warning, which I don't think would result in lost results:
>>>>>>>>>>>
>>>>>>>>>>> PHP Warning: array_shift(): The argument should be an array in
>>>>>>>>>>> .../submit/index.php on line 1683
>>>>>>>>>>>
>>>>>>>>>>> I haven't received any emailed Postgres errors either. When
>>>>>>>>>>> were these results submitted? I searched for "mellanox" over
>>>>>>>>>>> the past four days. It seems the results aren't buried in
>>>>>>>>>>> here, because I see no test run failures ...
>>>>>>>>>>>
>>>>>>>>>>> http://www.open-mpi.org/mtt/index.php?do_redir=659
>>>>>>>>>>>
>>>>>>>>>>> I'm assuming you're running with two Reporter INI sections:
>>>>>>>>>>> Textfile and MTTDatabase? Can you send some MTT client
>>>>>>>>>>> --verbose/--debug output from the below runs?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Ethan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> On May 21, 2008, at 5:28 AM, Pavel Shamis (Pasha) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here are the test results from my last MTT run:
>>>>>>>>>>>>>> +-------------+----------------+------+------+----------+------+
>>>>>>>>>>>>>> | Phase       | Section        | Pass | Fail | Time out | Skip |
>>>>>>>>>>>>>> +-------------+----------------+------+------+----------+------+
>>>>>>>>>>>>>> | MPI install | ompi/gcc       | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | MPI install | ompi/intel-9.0 | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | trivial        | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | trivial        | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | intel-suite    | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | imb            | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | imb            | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | presta         | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | presta         | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | osu_benchmarks | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | netpipe        | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Build  | netpipe        | 1    | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Run    | trivial        | 64   | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Run    | trivial        | 64   | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |
>>>>>>>>>>>>>> | Test Run    | intel-suite    | 492  | 0    | 0        | 0    |
>>>>>>>>>>>>>> +-------------+----------------+------+------+----------+------+
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the OMPI MTT DB (http://www.open-mpi.org/mtt) I found the
>>>>>>>>>>>>>> following "test run" results:
>>>>>>>>>>>>>> | Test Run    | trivial        | 64   | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Run    | trivial        | 64   | 0    | 0        | 0    |
>>>>>>>>>>>>>> | Test Run    | intel-suite    | 492  | 0    | 0        | 0    |
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And I cannot find this one:
>>>>>>>>>>>>>> | Test Run    | intel-suite    | 3179 | 165  | 400      | 0    |
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>> Some missing results are in mttdb_debug_file.16.txt (and
>>>>>>>>> 17.txt), which are the largest .txt files of the bunch. Eight
>>>>>>>>> variants isn't that many, but maybe it causes a problem when
>>>>>>>>> there is a lot of stderr/stdout? I'm surprised that
>>>>>>>>> submit/index.php barfs on files of this size:
>>>>>>>>>
>>>>>>>>> $ ls -l
>>>>>>>>> ...
>>>>>>>>> -rw-r--r-- 1 em162155 staff 956567 May 21 14:21
>>>>>>>>> mttdb_debug_file.16.inc.gz
>>>>>>>>> -rw-r--r-- 1 em162155 staff 9603132 May 21 14:09
>>>>>>>>> mttdb_debug_file.16.txt
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> $ client/mtt-submit -h www.open-mpi.org -f mttdb_debug_file.16.txt
>>>>>>>>> -z
>>>>>>>>> -u sun -p sun4sun -d
>>>>>>>>> LWP::UserAgent::new: ()
>>>>>>>>> LWP::UserAgent::proxy: http
>>>>>>>>>
>>>>>>>>> Filelist: $VAR1 = 'mttdb_debug_file.16.txt';
>>>>>>>>> LWP::MediaTypes::read_media_types: Reading media types from
>>>>>>>>> /ws/ompi-tools/lib/perl5/5.8.8/LWP/media.types
>>>>>>>>> LWP::MediaTypes::read_media_types: Reading media types from
>>>>>>>>> /usr/perl5/site_perl/5.8.4/LWP/media.types
>>>>>>>>> LWP::MediaTypes::read_media_types: Reading media types from
>>>>>>>>> /home/em162155/.mime.types
>>>>>>>>> LWP::UserAgent::request: ()
>>>>>>>>> LWP::UserAgent::send_request: POST
>>>>>>>>> http://www.open-mpi.org/mtt/submit/index.php
>>>>>>>>> LWP::UserAgent::_need_proxy: Not proxied
>>>>>>>>> LWP::Protocol::http::request: ()
>>>>>>>>> LWP::UserAgent::request: Simple response: OK
>>>>>>>>>
>>>>>>>>> $ tail -f /var/log/httpd/www.open-mpi.org/error_log | grep -w submit
>>>>>>>>> ...
>>>>>>>>> [client 192.18.128.5] PHP Fatal error: Allowed memory size of
>>>>>>>>> 33554432 bytes exhausted (tried to allocate 14 bytes) in
>>>>>>>>> /nfs/rontok/xraid/data/osl/www/www.open-mpi.org/mtt/submit/index.php
>>>>>>>>> on line 1559
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> We'll have to somehow be more efficient in these loops.
>>>>>>>>> E.g., line 1559:
>>>>>>>>>
>>>>>>>>> foreach (array_keys($_POST) as $k) {
>>>>>>>>>
>>>>>>>>> Maybe if we broke $_POST up into multiple parts (e.g.,
>>>>>>>>> $_POST_1, $_POST_2, ... $_POST_N)? Maybe we could do
>>>>>>>>> something more efficient than array_keys here? I'm not sure.
>>>>>>>>>
>>>>>>>>> The only workaround on the client side would be to break up
>>>>>>>>> the runs. Maybe do a single MPI Install at a time? Do
>>>>>>>>> ompi/gcc and then ompi/intel-9.0 as separate invocations of
>>>>>>>>> the MTT client.
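
(If I understand the workaround right, that means keeping two smaller INI
files -- one per MPI install, everything else identical -- and invoking the
MTT client once per file. Purely an illustrative sketch; the "..." stands
for the unchanged Test get/build/run and Reporter sections:)

    # ompi-gcc.ini
    [MPI install: ompi/gcc]
    ...

    # ompi-intel.ini
    [MPI install: ompi/intel-9.0]
    ...
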
>>>>>>>>>
>>>>>>>>> Sorry :-(
>>>>>>>>>
>>>>>>>>> -Ethan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>> From the log I see that all test results were submitted
>>>>>>>>>>>>>> successfully.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you please check?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Pasha
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>
>