On Tue, Feb 21, 2012 at 05:30:20PM -0500, Rayson Ho wrote:
> On Tue, Feb 21, 2012 at 12:06 PM, Rob Latham <robl_at_[hidden]> wrote:
> > ROMIO's testing and performance regression framework is honestly a
> > shambles. Part of that is a challenge with the MPI-IO interface
> > itself. For MPI messaging you exercise the API and you have pretty
> > much covered everything. MPI-IO, though, introduces hints. These
> > hints are great for tuning but make the testing "surface area" a lot
> > larger. We are probably going to have a chance to improve things
> > greatly with some recently funded proposals.
> Thanks for the replies Rob.
> I am interested in testing mainly because not a lot of projects have
> spare clusters lying around for performance regression testing. But
> then these days we can get machines from EC2 easily & relatively
> cheaply, so I was wondering if other projects are migrating their test
> infrastructure to EC2.

The good news is it is no longer 2001: folks can round up tens of
nodes, and with threading and oversubscribing start to exercise
hundreds of MPI processes. EC2, division clusters, and even a
desk-side machine might suffice for these scales.

The real challenge is how to do testing and research at O(100,000) MPI
processes. I have the good fortune to have access to Intrepid at
Argonne. I know access to these large machines can be somewhat hard
to come by.

Mathematics and Computer Science Division
Argonne National Lab, IL USA