I'm not sure we should preserve the current DB schema, for one simple reason:
the datastore is object-oriented storage and has different rules and
techniques than an RDBMS.
The basic storage unit in the datastore is an object which can be saved,
loaded and queried.
(Hadoop is based on the same principles, but is open source.)
It seems that the DB model for MTT over the datastore should not be complex at all.
The current MTT DB schema is mostly optimized for specific queries dictated
by the web UI. The datastore creates indexes automatically, based on submitted
I suggest we discuss/exchange DB layout proposals by email, and when we reach
a general understanding of how it should look, we switch to
Also, it seems to be no problem at all to get datastore access for an existing Gmail
account. You get a 500MB storage quota. It takes 5 minutes to start using it.
Here is some short info on the datastore API:
- how to submit a data model to the datastore
- how to save, load, query
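To make the "object which can be saved, loaded and queried" idea concrete, here is a rough sketch. It does NOT use the real datastore API: ToyStore is a hypothetical in-memory stand-in, and the TestRun fields are invented for illustration only, not a proposal for the actual MTT layout.

```python
import itertools
from dataclasses import dataclass


@dataclass
class TestRun:
    # Hypothetical fields, for illustration only (not the real MTT schema).
    platform: str
    compiler: str
    passed: int
    failed: int


class ToyStore:
    """In-memory stand-in for an object store; not the real datastore API."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def put(self, obj):
        # Save an object and return the key it was stored under.
        key = next(self._ids)
        self._objects[key] = obj
        return key

    def get(self, key):
        # Load an object back by its key.
        return self._objects[key]

    def query(self, kind, **filters):
        # Return all stored objects of the given kind whose attributes
        # equal every filter value.
        return [
            obj
            for obj in self._objects.values()
            if isinstance(obj, kind)
            and all(getattr(obj, name) == value for name, value in filters.items())
        ]


store = ToyStore()
key = store.put(TestRun("linux-x86_64", "gcc", 120, 3))
store.put(TestRun("linux-x86_64", "icc", 118, 5))

loaded = store.get(key)
gcc_runs = store.query(TestRun, compiler="gcc")
```

The point is only that one flat object per result, plus simple equality filters, may already cover much of what the web UI asks for. Whether that holds for the real queries is something to settle in the layout discussion.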
On Fri, Mar 20, 2009 at 5:38 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> On Mar 20, 2009, at 10:42 AM, Josh Hursey wrote:
>> Yeah I think this sounds like a good way to move forward with this
>> work. The database schema is pretty complex. If you need help on the
>> database side of things let me know.
>> To get started, would it be useful to have a meeting over the phone/
>> telepresence to design the datastore layout? This gives us an
>> opportunity to start from a blank slate with regards to the
>> datastore, so it may be useful brainstorm a bit beforehand.
> Yes, it probably would. My understanding of Hadoop (which is very
> high-level) is that you just dump everything in without too much concern about
> the structure / "schema". But I could be wrong on that.
>> The Google Apps account is under my personal Google account, so I'm
>> reluctant to use it. I think the reason it took so long for me, was
>> because when I originally signed up it was in limited beta. I think
>> the approval time is much shorter now (maybe a day?), and we can make
>> an openmpi or mtt account that we can use.
>> With regard to Hadoop, I don't think that IU has a set of machines
>> that would work, but I can ask around. We could always try Hadoop on
>> a single machine if people wanted to play around with data querying/
>> I don't have a strong preference either way, but Google Apps may
>> provide us with a lower overhead solution for the long run even
>> though it costs $$.
> It looks like there is a set that you can use for free. When you go over
> one of several metrics (CPU hours/day, storage, bandwidth in, bandwidth out,
> etc.), then you have to start paying. But even with that, the costs look
> *quite* reasonable and should be easily covered by the combined Open MPI
> organizations (I'm talking hundreds of dollars here, not tens of thousands).
> Jeff Squyres
> Cisco Systems