On Mar 20, 2009, at 10:42 AM, Josh Hursey wrote:
Yes, it probably would. My understanding of Hadoop (which is very high-level) is that you just dump everything in without too much concern about the structure / "schema". But I could be wrong on that.
Yeah, I think this sounds like a good way to move forward with this
work. The database schema is pretty complex. If you need help on the
database side of things let me know.
To get started, would it be useful to have a meeting over the phone/
telepresence to design the datastore layout? This gives us an
opportunity to start from a blank slate with regards to the
datastore, so it may be useful to brainstorm a bit beforehand.
It looks like there is a set that you can use for free. When you go over one of several metrics (CPU hours/day, storage, bandwidth in, bandwidth out, etc.), then you have to start paying. But even with that, the costs look *quite* reasonable and should be easily covered by the combined Open MPI organizations (I'm talking hundreds of dollars here, not tens of thousands).
The Google Apps account is under my personal Google account, so I'm
reluctant to use it. I think the reason it took so long for me was
because when I originally signed up it was in limited beta. I think
the approval time is much shorter now (maybe a day?), and we can make
an openmpi or mtt account that we can use.
With regard to Hadoop, I don't think that IU has a set of machines
that would work, but I can ask around. We could always try Hadoop on
a single machine if people wanted to play around with data querying/
I don't have a strong preference either way, but Google Apps may
provide us with a lower overhead solution for the long run even
though it costs $$.
mtt-devel mailing list