Ethan / Josh --
The HDF guys are interested in potentially using MTT. They have some questions about the database. Can you guys take a whack at answering them? (be sure to keep the CC, as Elena/Quincey aren't on the list)
On Nov 3, 2010, at 1:29 PM, Quincey Koziol wrote:
> Lots of interest here about MTT, thanks again for taking time to demo it and talk to us!
Glad to help.
> One lasting concern was the slowness of the report queries - what's the controlling parameter there? Is it the number of tests, the size of the output, the number of configurations of each test, etc?
All of the above. On a good night, Cisco dumps in 250k test runs to the database. That's just a boatload of data. End result: the database is *HUGE*. Running queries just takes time.
If the database wasn't so huge, the queries wouldn't take nearly as long. The size of the database is basically how much data you put into it -- so it's really a function of everything you mentioned. I.e., increasing any one of those items increases the size of the database. Our database is *huge* -- the DB guys tell me that it's lots and lots of little data (with blobs of stdout/stderr here and there) that make it "huge", in SQL terms.
Josh did some great work a few summers back that basically "fixed" the speed of the queries to a set speed by effectively dividing up all the data into month-long chunks in the database. The back-end of the web reporter only queries the relevant month chunks in the database (I think this is a postgres-specific SQL feature).
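To make the month-chunk idea concrete, here is a minimal sketch of how a reporter back-end could work out which chunks a date range touches. It is not MTT's actual code: the function name month_chunks, the "test_run" prefix, and the "prefix_YYYY_MM" naming scheme are all assumptions made up for illustration.

```python
from datetime import date


def month_chunks(start, end, prefix="test_run"):
    """Return hypothetical names of the month-long chunks overlapping [start, end].

    The naming scheme "<prefix>_YYYY_MM" is an assumption for illustration only.
    """
    if end < start:
        raise ValueError("end must not precede start")
    names = []
    year, month = start.year, start.month
    # Walk month by month until we pass the month containing `end`.
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


# A report covering mid-October through early November 2010 would only
# need to touch two chunks, no matter how many years of data exist.
chunks = month_chunks(date(2010, 10, 15), date(2010, 11, 3))
```

The point of the sketch is that the work per query depends on the width of the requested date range, not on the total age of the database.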
Additionally, we have the DB server on a fairly underpowered machine that is shared with a whole pile of other server duties (www.open-mpi.org, mailman, ...etc.). This also contributes to the slowness.
> For example, each HDF5 build includes on the order of 100 test executables, and we run 50 or so configurations each night. How would that compare with the OpenMPI test results database?
Good question. I'm CC'ing the mtt-devel list to see if Josh or Ethan could comment on this more intelligently than me -- they did almost all of the database work, not me.
I'm *guessing* that it won't come anywhere close to the size of the Open MPI database (we haven't trimmed the data in the OMPI database since we started gathering data in the database several years ago).
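As a rough back-of-envelope check on that guess, using only the numbers in this thread: it assumes (and this is just an assumption) that each HDF5 test executable run under each configuration would count as one test run, and it compares against Cisco's good-night volume alone rather than the whole Open MPI database.

```python
# Numbers quoted in this thread.
hdf5_test_executables = 100      # "on the order of 100 test executables"
hdf5_configurations = 50         # "50 or so configurations each night"
cisco_runs_per_night = 250_000   # "250k test runs" on a good night

# Assumption: one executable under one configuration == one test run.
hdf5_runs_per_night = hdf5_test_executables * hdf5_configurations

# How many times larger Cisco's nightly volume is than the HDF5 estimate.
ratio = cisco_runs_per_night / hdf5_runs_per_night
```

Under those assumptions HDF5 would add about 5,000 test runs per night, roughly one fiftieth of what Cisco alone submits on a good night.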