Yes, I think you're right -- making a "schema" for the datastore might
be quite easy. I'm on travel all this week and likely won't be able
to look into this stuff -- can you guys post a proposal and we can
dive into it from that angle?
On Mar 22, 2009, at 6:48 AM, Mike Dubman wrote:
> Hello guys,
> I'm not sure we should preserve the current DB schema, for one
> simple reason: the datastore is object-oriented storage and has
> different rules and techniques than an RDBMS.
> The basic storage unit in the datastore is an object which can be
> saved, loaded and queried.
> (Hadoop is based on the same principles, but is open source.)
> It seems that the DB model for MTT over the datastore should not be
> complex at all. The current MTT DB schema is mostly optimized for
> specific queries dictated by the web UI. The datastore creates
> indexes automatically, based on the history of submitted queries.
> I suggest we exchange and discuss DB layout proposals by email, and
> once we reach a general understanding of how it should look, we
> switch to telepresence.
> Also, it seems to be no problem at all to get datastore access for
> an existing Gmail account. You get a 500 MB storage quota. It takes
> 5 minutes to start using it.
> Here is some short info on the datastore API:
> - how to submit a data model to the datastore
> - how to save, load, query
> Please comment.
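To make the "object that can be saved, loaded and queried" idea concrete, here is a minimal sketch in plain Python. It does not use the Google datastore API at all: ToyDatastore is a hypothetical in-memory stand-in, and the TestRun fields are invented for illustration rather than taken from the real MTT schema. It only shows the shape of the workflow (declare a model, save an object, load it by key, query by property).

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class TestRun:
    # Hypothetical result fields; NOT the real MTT schema.
    mpi_name: str
    platform: str
    passed: int
    failed: int


class ToyDatastore:
    """In-memory stand-in for an object datastore (assumption, not a real API)."""

    def __init__(self):
        self._objects = {}      # key -> stored object
        self._ids = count(1)    # simple auto-incrementing key generator

    def save(self, obj):
        """Store an object and return the key it was stored under."""
        key = next(self._ids)
        self._objects[key] = obj
        return key

    def load(self, key):
        """Fetch a previously saved object by its key."""
        return self._objects[key]

    def query(self, kind, **filters):
        """Return all objects of the given kind whose properties equal the filters."""
        return [
            obj
            for obj in self._objects.values()
            if isinstance(obj, kind)
            and all(getattr(obj, name) == value for name, value in filters.items())
        ]


store = ToyDatastore()
key = store.save(TestRun("ompi-trunk", "linux-x86_64", passed=120, failed=3))
store.save(TestRun("ompi-v1.3", "linux-x86_64", passed=118, failed=0))

loaded = store.load(key)                                  # load by key
trunk_runs = store.query(TestRun, mpi_name="ompi-trunk")  # query by property
```

A real datastore would persist the objects and build indexes for the queries; the point here is only that the "schema" reduces to the class definition plus whichever properties we choose to filter on.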
> On Fri, Mar 20, 2009 at 5:38 PM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> On Mar 20, 2009, at 10:42 AM, Josh Hursey wrote:
> Yeah I think this sounds like a good way to move forward with this
> work. The database schema is pretty complex. If you need help on the
> database side of things let me know.
> To get started, would it be useful to have a meeting over the phone/
> telepresence to design the datastore layout? This gives us an
> opportunity to start from a blank slate with regards to the
> datastore, so it may be useful to brainstorm a bit beforehand.
> Yes, it probably would. My understanding of Hadoop (which is very
> high-level) is that you just dump everything in without too much
> concern about the structure / "schema". But I could be wrong on that.
> The Google Apps account is under my personal Google account, so I'm
> reluctant to use it. I think the reason it took so long for me was
> that when I originally signed up it was in limited beta. I think
> the approval time is much shorter now (maybe a day?), and we can make
> an openmpi or mtt account that we can use.
> With regard to Hadoop, I don't think that IU has a set of machines
> that would work, but I can ask around. We could always try Hadoop on
> a single machine if people wanted to play around with data querying/
> I don't have a strong preference either way, but Google Apps may
> provide us with a lower overhead solution for the long run even
> though it costs $$.
> It looks like there is a set of resources that you can use for free.
> When you go over one of several metrics (CPU hours/day, storage,
> bandwidth in, bandwidth out, etc.), you have to start paying. But
> even with that, the costs look *quite* reasonable and should be
> easily covered by the combined Open MPI organizations (I'm talking
> hundreds of dollars here, not tens of thousands).
> Jeff Squyres
> Cisco Systems
> mtt-devel mailing list