MTT Devel Mailing List Archives

From: Andrew Friedley (afriedle_at_[hidden])
Date: 2006-06-29 13:06:02


Jeff Squyres (jsquyres) wrote:
>>-----Original Message-----
>>From: mtt-users-bounces_at_[hidden]
>>[mailto:mtt-users-bounces_at_[hidden]] On Behalf Of Andrew Friedley
>>Sent: Thursday, June 29, 2006 11:29 AM
>>To: General user list for the MPI Testing Tool
>>Subject: Re: [MTT users] Test output to perfbase
>>
>>
>>>I'll change my question to: can you send an example of the format that
>>>perfbase is expecting for submitting the data running multiple tests in
>>>a single http post to perfbase.php? E.g., say I have the results of
>>>running all the intel tests -- what is the format that you are
>>>expecting?
>>
>>I'll try to make some sample data tonight. I need to review how line
>>separators work with perfbase first.
>
>
> Thanks.
>
>
>>>Another question -- how exactly are you categorizing these results on
>>>the server? You made mention of "runs" below -- from your context I'm
>>>assuming that "run" has some specific meaning to perfbase, especially
>>>with the categorization of output data.
>>
>>For efficiency, one run is one run of a test suite. What
>
>
> Which efficiency? Uploads? Database storage? Querying?

Primarily database - each run is stored in postgres as a table in a
database. Fields that vary are stored as rows in that table - one row
has all the varying fields, i.e. each field is a column. I think
non-varying fields are stored once as a row in a special, separate table.
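
A rough sketch of that layout, for illustration only. It uses Python's built-in sqlite3 as a stand-in for postgres, and the table names (run_constants, run_<id>) and the store_run helper are made up here, not perfbase's actual schema:

```python
import sqlite3

def store_run(conn, run_id, constant, rows):
    """Store one run: constants once, varying fields as one row per test.

    Hypothetical layout, not perfbase's real schema. Assumes `rows` is
    non-empty and that field names are safe to use as column names.
    """
    cur = conn.cursor()
    # Non-varying fields go once into a shared, separate table.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS run_constants "
        "(run_id INTEGER, name TEXT, value TEXT)"
    )
    cur.executemany(
        "INSERT INTO run_constants VALUES (?, ?, ?)",
        [(run_id, name, str(value)) for name, value in constant.items()],
    )
    # Each run gets its own table; each varying field is a column.
    columns = sorted(rows[0])
    table = f"run_{int(run_id)}"
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    cur.execute(f"CREATE TABLE {table} ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    cur.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [tuple(str(r[c]) for c in columns) for r in rows],
    )
    conn.commit()
    return table
```

With one table per run, the number of tables grows with the number of runs, which is why packing as much as possible into each run matters.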

> In a conversation with Sun, it turns out that we both want to have the
> ability to see partial results (e.g., running the entire Intel suite may
> take many hours -- it would be good to be able to see results
> more-or-less as they occur). Is there a technical issue that would
> prevent submitting 1 (or small batches of) result(s) at a time?

I think we're getting into the realm of 'too much' here. Both the
current design and especially your new proposed design are batch
oriented, not stream oriented. Heck, so is MTT in general.

This is doable, but when we start hammering/scaling this system, getting
as much information as possible in a perfbase run is going to be very
important. I remember Brian agreeing with me that tens or hundreds of
thousands of tables in postgres is a bad idea.

>
>>field is the BTL selection reported in? If it 'occurs once' in the XML,
>>it is stored per test suite (run), if it 'occurs many', that is a per-test
>>field and can vary in a single run (i.e. tcp and openib would go together).
>
>
> Ok, good. So this would help with the database storage required for
> test results -- many of the fields will be the same for each of the
> individual tests in the intel test suite (for example).

Right - part of this field matching review might involve making sure we
have the right fields marked as constant/varying over a test suite run.
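
One way to sanity-check the constant/varying marking is to look at actual results from a suite run. A minimal sketch, assuming each test result is available as a dict of field name to value; split_fields and the field names in the test data are hypothetical, not part of MTT or perfbase:

```python
def split_fields(results):
    """Split per-test results into constant fields and per-test rows.

    A field counts as constant only if every result has it and all
    values are equal. Everything else stays in the per-test rows.
    """
    if not results:
        return {}, []
    names = set().union(*(r.keys() for r in results))
    constant = {}
    for name in names:
        present_everywhere = all(name in r for r in results)
        first = results[0].get(name)
        if present_everywhere and all(r[name] == first for r in results):
            constant[name] = first
    varying = [
        {k: v for k, v in r.items() if k not in constant} for r in results
    ]
    return constant, varying
```

Fields that come back as constant are candidates for 'occurs once'; anything left in the per-test rows would need 'occurs many'.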

>
>>>1. send all results from the above intel run in a single submit (i.e.,
>>>all tcp and all openib results). Since we submit the MCA params as part
>>>of the data, our queries later can distinguish tcp vs. openib data.
>>
>>This is what I want. We can easily construct queries to only
>
>
> So I guess I'm still not clear on *why* you want this. :-) Can you
> specify the reasons?

Well, I think having a test suite run with all its variations
interpreted as a single perfbase run makes sense. We could certainly
draw the line elsewhere, but I think it's appropriate that a test suite
run with a particular mpi install on a particular system makes a
suitable base unit. It matches both the MTT and perfbase architectures
well - we can support this easily in MTT, it scales well in perfbase,
doesn't compromise our query ability, and just plain gets the job done.

>
>>get what we want i.e. just tcp, just threaded, just odin, etc.
>>
>>
>>>2. send all the tcp results in one submit and then send the openib data
>>>in a separate submit.
>>
>>Bleh - so this works if you only consider the 'btl' parameter. But what
>>if we want to consider other parameters in this way?
>
>
> Yes, that information (tcp vs. openib) is in one of the fields that we
> send back (it has to be, otherwise the results are somewhat
> meaningless). It's not a standalone "btl" field, though -- it's more of
> a "here's the MCA parameters that were specified" field. So queries for
> tcp results will probably need to search for "tcp" in the MCA parameters
> field.
>
> But this is the same issue regardless of whether we submit 1 result at a
> time or all at once, right? I guess I don't see the difference for
> selecting "tcp" vs. "openib" results based on whether we submit 1 result
> at a time or all at once -- can you clarify? I think I must be missing
> something...
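
For illustration, the "search for tcp in the MCA parameters field" query could look something like the sketch below. It assumes the field holds the parameters as they would appear on the mpirun command line (e.g. "--mca btl tcp,self"); that storage format and the uses_btl helper are assumptions, not what MTT actually submits:

```python
def uses_btl(mca_params, btl):
    """Return True if `btl` is listed in the btl selection of `mca_params`.

    Looks for a "btl" token and checks the comma-separated list that
    follows it, so "tcp" does not match inside unrelated parameters.
    """
    tokens = mca_params.split()
    for i, tok in enumerate(tokens[:-1]):
        if tok == "btl":
            return btl in tokens[i + 1].split(",")
    return False
```

The same filter applies whether results were submitted one at a time or all at once, since it only looks at the stored field.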

I think you missed what I was saying - picking which BTL was used for
any kind of storage differentiation just seems completely arbitrary to
me. Not only that, it's Open MPI specific, or do we not care about being
MPI agnostic any more?

Though, if your new server side idea is implemented, it doesn't really
matter how you split stuff to send over HTTP, as it all gets thrown back
together before going into perfbase.
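
A sketch of what "thrown back together" could mean on the server, assuming each HTTP submission carries the suite, MPI install, and system it belongs to plus a list of results. The dict keys and the merge_submissions helper are hypothetical, not the actual perfbase.php interface:

```python
from collections import defaultdict

def merge_submissions(submissions):
    """Group partial submissions into one result list per run.

    A run is keyed by (suite, mpi_install, system), so it makes no
    difference how the client split the results across HTTP posts.
    """
    runs = defaultdict(list)
    for sub in submissions:
        key = (sub["suite"], sub["mpi_install"], sub["system"])
        runs[key].extend(sub["results"])
    return dict(runs)
```

Each merged list would then go into perfbase as a single run.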

You're right in that it doesn't have much/anything to do with the split
submission issue.

> If this is all possible, then -- at least in my mind -- I don't see a
> reason why multiple submits vs. a single submit is *required*.
> Obviously, multiple submits will take more bandwidth than a single
> submit -- but I see that as an optimization that we can [and should]
> work out later. Specifically: reducing the bandwidth of submits doesn't
> need to be in the initial version since our primary, immediate goal is
> to get this functional, running nightly tests, and sending out test
> results in the morning as long as the current, unoptimized bandwidth
> requirements are not too onerous on milliways.

Well, the direction I was going with the server side was that a test
suite run would be in one HTTP POST. As far as I'm concerned it's a
matter of writing code to do it differently, and how soon you want this
to work.

Andrew