Open MPI logo

MTT Devel Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all MTT Devel mailing list

Subject: Re: [MTT devel] Reporter Slowness
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2008-01-31 12:57:24


Thanks Josh!

On Jan 31, 2008, at 12:00 PM, Josh Hursey wrote:

> Ok so the script is done. It took a bit longer than I had expected,
> but when it finished then things sped back up ('24 hours' of data in 6
> sec). There are a few more maintenance operations I want to run which
> will help out a bit more, but I'll push those to this weekend.
>
> Thanks for your patience, and let me know if it feels sluggish again.
> So as of this email things should be back to normal.
>
> Cheers,
> Josh
>
> On Jan 30, 2008, at 5:09 PM, Josh Hursey wrote:
>
>> I've started the script running.
>>
>> Below is a short version, and a trilogy of the gory details. I wanted
>> to write up the details so if it ever happens again to us (or someone
>> else) they can see what we did to fix it.
>>
>>
>> The Short Version:
>> ------------------
>> The Slowness(tm) was caused by the recent shifting of data in the
>> database to resolve the partition table problems seen earlier this
>> month.
>>
>> The bad news is that it will take about 14 hours to finish.
>>
>> The good news is that I confirmed that this will fix the performance
>> problem that we are seeing. In the small run this technique reduce
>> the
>> '24 hour' query execution time from ~40 sec back down to ~8 sec.
>>
>> This may slow down client submits this evening, but should not
>> prevent
>> them from being able to submit. The 'DELETE' operations do not
>> require
>> an exclusive lock, so the 'INSERT' operations should proceed fine
>> concurrently. The 'INSERT' operations will need to be blocked while
>> the 'VACUUM FULL' operation is progressing since it *does* require an
>> exclusive lock. The 'INSERT' operations will proceed normally once
>> this lock is released resulting in a temporary slowdown for clients
>> that submit during these windows of time (about 20 min or so).
>>
>>
>>
>> The Details: Part 1: What I did earlier this week:
>> (more than you wanted to know for prosperity purposes)
>> --------------------------------------------------
>> The original problem was that the master partition tables accidently
>> started storing data because I forgot to load the 2008 partition
>> tables into the database before the first of the year. :( So we
>> loaded
>> the partition tables, but we still needed to move the misplaced data.
>>
>> To move the misplaced data we have to duplicate the row (so it is
>> stored properly this time), but we also need to take care in
>> assigning
>> row IDs to the duplicate rows. We cannot give the dup'ed rows the
>> same
>> ID or we will be unable to differentiate the original and the dup'ed
>> row. So I created a dummy table for mpi_install/test_build/test_run
>> to
>> translate between the orig row ID and the dup'ed row ID. I used the
>> nextval on the sequence to populate the values for the dup'ed rows in
>> the dummy table.
>>
>> Now that I had translation I joined the dummy table with it's
>> corresponding master table (e.g. "mpi_install join mpi_install_dummy
>> on mpi_install.mpi_install_id = mpi_install_dummy.orig_id"), and
>> instead of selecting the original ID from the dummy table I selected
>> the new dup'ed ID. I inserted this selection back in to the
>> mpi_install table. (Cool little trick that PostgreSQL lets you get
>> away with sometimes).
>>
>> Once I have duplicated all of the effected rows, then I updated all
>> references to the original ID and set it to the duplicated ID in the
>> test_build/test_run tables. This removed all internal reference to
>> the
>> original ID, and replaced it with the duplicated so we retain
>> integrity of the data.
>>
>> Once I have verified that no tables references the original row I
>> delete those rows from the mpi_install/test_build/test_run tables.
>>
>>
>>
>> The Details: Part 2: What I forgot to do:
>> -----------------------------------------
>> When rows are deleted from PostgreSQL the disk space used continues
>> to
>> be reserved for this table, and is not reclaimed unless you 'VACUUM
>> FULL' this table. PostgreSQL does this for many good reasons which
>> are
>> described in their documentation. However in the case of the master
>> partition tables we want them to release all of their disk space
>> since
>> we should never be storing data in this particular table.
>>
>> I did a 'VACUUM FULL' on the mpi_install and test_build tables
>> originally, but did not do it on the test_run table since this
>> operation requires an exclusive lock on the table and can take a long
>> time to finish. Further I only completed about 1% of the deletions
>> for
>> test_run before I stopped this operation choosing to wait for the
>> weekend since it will take a long time to complete.
>>
>> By only deleting part of the test_run master table (which contained
>> about 1.2 Million rows) this caused the queries on this table to slow
>> down considerably. The Query Planner estimated the execution of the
>> '24 hour' query at 322,924 and it completed in about 40 seconds. I
>> ran
>> 'VACUUM FULL test_run' which only Vacuums the master table, and then
>> re-ran the query. This time the Query Planner estimated the execution
>> at 151,430 and it completed in about 8 seconds.
>>
>>
>>
>> The Details: Part 3: What I am doing now:
>> -----------------------------------------
>> Currently I am deleting the rest of the old rows from test_run. There
>> are approx. 1.2 million rows, and this should complete in about 13
>> hours.
>>
>> After every 100 K deletions I'm running a 'VACUUM FULL' on test_run.
>> My hope is that by doing it this way instead of just once at the end
>> of all 1.2 M will cause each one to take less time. I hope this will
>> limit the interruptions seen by the MTT clients submitting results
>> this evening.
>>
>> I'll send email once the script is complete, and things seem back to
>> normal.
>>
>> Cheers,
>> Josh
>>
>> On Jan 30, 2008, at 4:12 PM, Jeff Squyres wrote:
>>
>>> I'd go ahead and do it now.
>>>
>>> On Jan 30, 2008, at 4:04 PM, Josh Hursey wrote:
>>>
>>>> It seems the reporter has gotten slower :( Now it is working in the
>>>> range of 40 - 50 seconds for the 24 hour query which is not
>>>> reasonable. This should be much lower.
>>>>
>>>> Looking at the explain of the query I have some ideas on how to
>>>> make
>>>> things better, but this will slow things down a for a while as I do
>>>> this work (maybe a day or two, can't say for sure).
>>>>
>>>> The question is should I wait until Friday COB to start this or
>>>> should
>>>> I do it immediately?
>>>>
>>>> Let me know,
>>>> Josh
>>>> _______________________________________________
>>>> mtt-devel mailing list
>>>> mtt-devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
>>>
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>>
>>> _______________________________________________
>>> mtt-devel mailing list
>>> mtt-devel_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
>>
>> _______________________________________________
>> mtt-devel mailing list
>> mtt-devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
>
> _______________________________________________
> mtt-devel mailing list
> mtt-devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel

-- 
Jeff Squyres
Cisco Systems