MTT Devel Mailing List Archives

Subject: Re: [MTT devel] GSOC application
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2009-04-13 10:44:33


On Mon, Apr/13/2009 04:15:23PM, Mike Dubman wrote:
> Hello Guys,
>
> Please comment on the proposed object model and flows. We will have 1-2
> people working on this in 2-3 weeks. Before then I would like to finalize
> the scope and flows.
>
> Thanks
>
> Mike.
>
> On Mon, Apr 6, 2009 at 4:54 PM, Mike Dubman <mike.ompi_at_[hidden]> wrote:
>
> Hello Guys,
>
> I have played a bit with google datastore and here is a proposal for mtt
> DB infra and some accompanying tools for submission and querying:
>
> 1. Scope and requirements
> ====================
>
> a. provide storage services for storing test results generated by mtt.
> Storage services will be implemented over datastore.
> b. provide storage services for storing benchmarking results generated
> by various MPI-based applications (not mtt based, for example: fluent,
> openfoam)
> c. test or benchmarking results stored in the datastore can be grouped
> and referred to as a group (for example: an mtt execution can generate
> many mtt results consisting of different phases. This mtt execution will
> be referred to as a session)
> d. Benchmarking and test results which are generated by mtt or any other
> mpi based applications, can be stored in the datastore and grouped by
> some logical criteria.
> e. mtt should not depend on or directly call any datastore-provided
> APIs. The mtt client (or framework/scripts executing MPI-based
> applications) should generate test/benchmarking results in some internal
> format, which will be processed later by external tools. These external
> tools will be responsible for saving test results in the datastore. The
> same rules should apply to non-mtt-based executions of MPI-based
> applications (like fluent, openfoam, ...). The scripts which wrap
> such executions will dump benchmarking results in some internal form for
> later processing by external tools.
>
> f. The internal form for representation of test/benchmarking results can
> be XML. The external tool will receive (as cmd line params) XML files,
> process them and save to the datastore.
>
> g. The external tools will be familiar with the datastore object model and
> will provide a bridge between test results (XML) and the actual datastore.
>
> 2. Flow and use-cases
> =================
>
> a. The mtt client will dump all test-related information into an XML file.
> A file will be created for every phase executed by mtt. (Today there
> are many summary txt and html files generated for every test phase; it
> is pretty easy to add XML generation of the same information.)

Will this translate to something like
lib/MTT/Reporter/GoogleDatabase.pm? If we are to move away from the
current MTT Postgres database, we want to be able to submit results to
both the current MTT database and the new Google database during the
transition period. Having a GoogleDatabase.pm would make this easier.

>
> b. mtt_save_to_db.py - a script which will go over the mtt scratch dir,
> find all XML files generated for every mtt phase, parse them and save
> them to the datastore, preserving test result relations, i.e. all test
> results will be grouped by mtt general info: mpi version, name, date, ....
>
> c. The same script can scan, parse and save XML files generated by
> wrapper scripts for non-mtt-based executions (fluent, ..)
>
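
For reference, the scan-and-parse step in 2b and 2c might look roughly like the sketch below. It is only an illustration under stated assumptions: the XML layout (`<phase name="..."><attr key="..." value="..."/></phase>`), the `SESSION_KEYS` fields and all function names are hypothetical, since the proposal does not fix a schema, and the final datastore write is left out.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical keys; the real "general info" fields are not specified above.
SESSION_KEYS = ("mpi_name", "mpi_version")


def write_phase_file(path, phase, attrs):
    """Dump one phase result as XML (assumed layout, for illustration only)."""
    root = ET.Element("phase", name=phase)
    for key, value in attrs.items():
        ET.SubElement(root, "attr", key=key, value=str(value))
    ET.ElementTree(root).write(path)


def find_xml_files(scratch_dir):
    """Return the sorted paths of all .xml files below scratch_dir."""
    found = []
    for root, _dirs, files in os.walk(scratch_dir):
        found.extend(os.path.join(root, f) for f in files if f.endswith(".xml"))
    return sorted(found)


def parse_phase_file(path):
    """Return (phase_name, attribute_dict) for one result file."""
    root = ET.parse(path).getroot()
    attrs = {a.get("key"): a.get("value") for a in root.findall("attr")}
    return root.get("name"), attrs


def group_by_session(scratch_dir):
    """Group phase results by the general info they share."""
    sessions = defaultdict(list)
    for path in find_xml_files(scratch_dir):
        phase, attrs = parse_phase_file(path)
        session_id = tuple(attrs.get(k) for k in SESSION_KEYS)
        sessions[session_id].append((phase, attrs))
    return dict(sessions)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        common = {"mpi_name": "ompi", "mpi_version": "1.3"}
        write_phase_file(os.path.join(scratch, "install.xml"), "MPIInstall",
                         dict(common, arch="x86_64"))
        write_phase_file(os.path.join(scratch, "build.xml"), "TestBuild",
                         dict(common, compiler_name="gnu"))
        for session_id, results in group_by_session(scratch).items():
            print(session_id, [phase for phase, _ in results])
```

Grouping on a tuple of shared fields keeps the parser independent of any one phase's attributes; a wrapper script for a non-mtt run would only need to emit files in the same assumed layout.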

I'm confused here. Can't MTT be outfitted to report results of a
Fluent run?

> d. A mtt_query_db.py script will be provided with basic query
> capabilities over the proposed datastore object model. Most users will
> prefer writing custom sql-like select queries for fetching results.
>
> 3. Important notes:
> ==============
>
> a. A single mtt client execution generates many result files; every
> generated file represents a test phase. This file contains test results
> and can be characterized as a set of attributes with their values. Every
> test phase has its own attributes, which differ between phases.
> For example: the TestBuild phase has the attribute keys "compiler_name,
> compiler_version", while the MPIInstall phase has attributes: prefix_dir,
> arch, ....
> Hence, most of the datastore objects representing phases of MTT are
> derived from the "db.Expando" model, which allows dynamic attributes
> in its derived sub-classes.
>
> Attached is an archive with a simple test of using the datastore for mtt.
> Please see the models.py file with the proposed object model and comment.
>
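
Since models.py is not visible here, the dynamic-attribute idea in note 3a can be illustrated with a plain-Python stand-in. This is a minimal sketch and not the App Engine API: the class names are hypothetical, and a real model would presumably subclass db.Expando rather than use `setattr`.

```python
class PhaseResult:
    """Plain-Python stand-in for a model with dynamic attributes.

    Not the App Engine API: it only mimics the idea of one fixed field
    plus arbitrary per-phase attributes.
    """

    def __init__(self, phase, **attrs):
        self.phase = phase             # fixed field, present on every result
        for key, value in attrs.items():
            setattr(self, key, value)  # dynamic, phase-specific attribute

    def dynamic_attributes(self):
        """Return the names of the attributes that are not fixed fields."""
        return sorted(k for k in vars(self) if k != "phase")


class BuildPhaseResult(PhaseResult):
    """Hypothetical model for the TestBuild phase."""

    def __init__(self, **attrs):
        super().__init__("TestBuild", **attrs)


class InstallPhaseResult(PhaseResult):
    """Hypothetical model for the MPIInstall phase."""

    def __init__(self, **attrs):
        super().__init__("MPIInstall", **attrs)


build = BuildPhaseResult(compiler_name="gnu", compiler_version="4.3.2")
install = InstallPhaseResult(prefix_dir="/opt/ompi", arch="x86_64")
print(build.dynamic_attributes())
print(install.dynamic_attributes())
```

Each phase class carries only the attributes its result file supplied, so adding a new key to a phase needs no schema change.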

I don't see the models.py attachment.

Thanks,
Ethan

> You can run the attached example in the google datastore dev
> environment. (http://code.google.com/appengine/downloads.html)
>
> Please comment.
>
> Thanks
> Mike
>
> On Tue, Mar 24, 2009 at 12:17 AM, Jeff Squyres <jsquyres_at_[hidden]>
> wrote:
>
> On Mar 23, 2009, at 9:05 AM, Ethan Mallove wrote:
>
> -------------------+---------------------+----------
> Resource           | Unit                | Unit cost
> -------------------+---------------------+----------
> Outgoing Bandwidth | gigabytes           | $0.12
> Incoming Bandwidth | gigabytes           | $0.10
> CPU Time           | CPU hours           | $0.10
> Stored Data        | gigabytes per month | $0.15
> Recipients Emailed | recipients          | $0.0001
> -------------------+---------------------+----------
>
> Would we itemize the MTT bill on a per user basis? E.g., orgs that
> use MTT more, would have to pay more?
>
> Let's assume stored data == incoming bandwidth, because we never throw
> anything away. And let's go with the SWAG of 100GB. We may or may
> not be able to gzip the data uploading to the server. So if anything,
> we *might* be able to decrease the incoming data and have higher level
> of stored data.
>
> I anticipate our outgoing data to be significantly less, particularly
> if we can gzip the outgoing data (which I think we can). You're
> right, CPU time is a mystery -- we won't know what it will be until we
> start running some queries to see what happens.
>
> 100GB * $0.10 = $10
> 100GB * $0.15 = $15
> total = $25 for the first month
>
> So let's SWAG at $25/mo for a year = $300. This number will be wrong
> for several reasons, but it at least gives us a ballpark. For
> $300/year, I think we (the OMPI project) can find a way to pay for
> this fairly easily.
> --
> Jeff Squyres
> Cisco Systems
>
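
The ballpark above can be reproduced with a few lines of Python. This is a sketch of the same arithmetic only: it uses the incoming-bandwidth and stored-data rates from the table, keeps the 100GB SWAG, and ignores outgoing bandwidth and CPU time as the estimate does. The function and constant names are made up for illustration.

```python
from decimal import Decimal

# Unit costs from the table above, in US dollars.
INCOMING_PER_GB = Decimal("0.10")
STORED_PER_GB_MONTH = Decimal("0.15")


def monthly_estimate(gigabytes):
    """Incoming bandwidth plus one month of storage for the same data."""
    return gigabytes * INCOMING_PER_GB + gigabytes * STORED_PER_GB_MONTH


monthly = monthly_estimate(100)  # the 100GB SWAG
yearly = monthly * 12            # flat $25/mo assumed for every month
print(f"monthly: ${monthly:.2f}, yearly: ${yearly:.2f}")
# prints "monthly: $25.00, yearly: $300.00"
```

Decimal is used instead of float so the dollar amounts stay exact.
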
> _______________________________________________
> mtt-devel mailing list
> mtt-devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel
>
