From: Andrew Friedley (afriedle_at_[hidden])
Date: 2006-06-29 11:28:52


Jeff Squyres (jsquyres) wrote:
>>-----Original Message-----
>>From: mtt-users-bounces_at_[hidden]
>>[mailto:mtt-users-bounces_at_[hidden]] On Behalf Of Andrew Friedley
>>Sent: Thursday, June 29, 2006 1:10 AM
>>To: General user list for the MPI Testing Tool
>>Subject: Re: [MTT users] Test output to perfbase
>>
>>
>>>- I know you sent one before, but I've made a mess of managing huge
>>>volumes of mail with Outlook, and I can't find the example you sent of
>>>the format that perfbase is expecting for all the test run results. Can
>>>you send the example again? This is the send-all-the-results-at-once
>>>vs. send-one-result-at-a-time issue.
>>
>>I can't find it, is this recent or from a long time ago?
>
>
> Ok, so maybe I didn't lose as much mail as I thought. :-)
>
> I'll change my question to: can you send an example of the format that
> perfbase is expecting when submitting the data from multiple tests in
> a single HTTP POST to perfbase.php? E.g., say I have the results of
> running all the intel tests -- what is the format that you are
> expecting?

I'll try to make some sample data tonight. I need to review how line
separators work with perfbase first.

> Another question -- how exactly are you categorizing these results on
> the server? You made mention of "runs" below -- from your context I'm
> assuming that "run" has some specific meaning to perfbase, especially
> with the categorization of output data.

For efficiency, one run is one run of a test suite. What field is the
BTL selection reported in? If it 'occurs once' in the XML, it is stored
per test suite (run); if it 'occurs many', it is a per-test field and
can vary within a single run (i.e. tcp and openib would go together).
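
To make the once/many distinction concrete, here is a rough Python sketch. It is not perfbase's actual input format or API; the function name build_run and the field names (suite, hostname, test, btl, result) are made up for illustration. It only shows the idea that 'once' fields are stored a single time per run and must not vary, while 'many' fields are stored per test.

```python
def build_run(results, once_fields):
    """Group per-test results into one run.

    Fields named in once_fields are stored once for the whole run and must
    have the same value in every result; all other fields are kept per test.
    (Hypothetical helper, not part of perfbase or MTT.)
    """
    once = {}
    many = []
    for result in results:
        for name in once_fields:
            if name in once and once[name] != result[name]:
                raise ValueError(
                    f"'{name}' is declared once per run but varies: "
                    f"{once[name]!r} vs {result[name]!r}"
                )
            once[name] = result[name]
        # Everything not declared 'once' is a per-test value.
        many.append({k: v for k, v in result.items() if k not in once_fields})
    return {"once": once, "many": many}


# Illustrative data only: one suite, two BTL selections.
results = [
    {"suite": "intel", "hostname": "odin", "test": "t1", "btl": "tcp", "result": "pass"},
    {"suite": "intel", "hostname": "odin", "test": "t1", "btl": "openib", "result": "pass"},
]

# With btl as a per-test ('many') field, tcp and openib fit in one run.
run = build_run(results, once_fields={"suite", "hostname"})
```

If btl were declared 'once' instead, the same data would be rejected, which is why tcp and openib results could not share a run in that case.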

> For example, let's say I have the following exec in my MPI Details
> section:
>
> exec = mpirun --mca &enumerate("tcp","openib"),self -np &test_np() \
> &test_executable() &test_argv()
>
> This will generate two mpirun command lines for every test executable
> (one for tcp,self and one for openib,self). Would you consider the tcp
> outputs a different "run" than the openib outputs?
>
> Regardless of the terminology, I eventually want to be able to query for
> tcp run data separately from the openib run data. I'm assuming that
> this has a big impact on how we submit data to perfbase. I see two
> possibilities offhand:
>
> 1. send all results from the above intel run in a single submit (i.e.,
> all tcp and all openib results). Since we submit the MCA params as part
> of the data, our queries later can distinguish tcp vs. openib data.

This is what I want. We can easily construct queries to get only what
we want, i.e. just tcp, just threaded, just odin, etc.
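
As a rough illustration of why a single submit is enough, the Python sketch below filters per-test rows by the parameters submitted with them. This is not a perfbase query; the select helper and the row fields (test, btl, threaded, result) are assumptions made up for the example.

```python
def select(rows, **criteria):
    """Return the rows whose fields match every given criterion.

    Hypothetical stand-in for a query against stored results.
    """
    return [
        row for row in rows
        if all(row.get(field) == value for field, value in criteria.items())
    ]


# Illustrative rows from one combined submit (tcp and openib together).
rows = [
    {"test": "t1", "btl": "tcp", "threaded": False, "result": "pass"},
    {"test": "t1", "btl": "openib", "threaded": False, "result": "fail"},
    {"test": "t2", "btl": "tcp", "threaded": True, "result": "pass"},
]

tcp_only = select(rows, btl="tcp")
tcp_threaded = select(rows, btl="tcp", threaded=True)
```

Because every row carries its own parameters, the same stored run can be sliced by btl, by threading, or by any other submitted field without splitting the submit up front.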

> 2. send all the tcp results in one submit and then send the openib data
> in a separate submit.

Bleh - so this works if you only consider the 'btl' parameter. But what
if we want to consider other parameters in this way?

> I assume that there are storage and/or query efficiency issues with this
> decision. We can do either of these groupings in the client -- which
> should we use?
>
>
>>>- Apache is going to have a max upload size (2MB? I always forget what
>>>milliways is configured for). The client will need to be able to split
>>>up results that are larger than this. This has two implications:
>>>
>>>1. If a single report is larger than this size (e.g., if the output from
>>>an "install MPI" phase is larger than the max upload size), Badness will
>>>ensue. I don't quite know how to handle this.
>>
>>Make the max upload size bigger :)
>
>
> More on this below.
>
>
>>>2. In the test run phase, given that we're switching from sending all the
>>>output from each test individually to sending them all at once, if the
>>>total size of a given report is larger than the max limit, can I
>>>arbitrarily split up the output results and send them in <max_size
>>>chunks? (preserving header/data groupings, of course) I'm assuming
>>>yes, but just wanted to make sure.
>>
>>Not sure. I think this will work, but it should be avoided. With how
>>things are now, this is going to generate a run per submission, and the
>>whole point of sending test results all at once is to consolidate them
>>into one run. Not the best, but not the end of the world either.
>
>
> I'm not sure I follow this (again, I think you have some special meaning
> for the word "run" -- can you explain?). What are the benefits /
> drawbacks of sending in a single submit vs. multiple submits?
>
>
>>Though I think it's possible for us to collect several files on the
>>server side before submitting them to perfbase as one single run.
>>Problem is, that's going to be a lot of work on the server side.
>
>
> Depends on how it's done. It doesn't have to be too much work or too
> complicated.
>
>
>>Depending on how the apache limit is enforced, this would involve
>>writing files to storage or preserving the data between HTTP POSTs in
>>some way.
>
>
> Apache enforces the limit by effectively terminating the input (it might
> even close the socket? I don't remember offhand). Suffice it to say
> that whatever the mechanism, we can't go over the limit.
>
> I have some thoughts on this, but I'll start another thread for it.
>
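
For reference, the client-side splitting discussed in item 2 above (sending results in <max_size chunks while preserving header/data groupings) could look roughly like the Python sketch below. The function name chunk_groups is made up, each group is assumed to be one already-formatted header/data string, and the sizes ignore any separators or HTTP form overhead, so a real limit would need some headroom.

```python
def chunk_groups(groups, max_bytes):
    """Pack header/data groups into chunks of at most max_bytes each.

    A group is never split across chunks. A single group larger than
    max_bytes cannot be sent at all, so that case raises an error.
    (Hypothetical helper, not part of MTT.)
    """
    chunks = []
    current = []
    current_size = 0
    for group in groups:
        size = len(group.encode("utf-8"))
        if size > max_bytes:
            raise ValueError("a single group exceeds the upload limit")
        # Start a new chunk when adding this group would go over the limit.
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current = []
            current_size = 0
        current.append(group)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


# Illustrative data: three 40-byte groups with a 100-byte limit.
groups = ["a" * 40, "b" * 40, "c" * 40]
chunks = chunk_groups(groups, max_bytes=100)
```

As noted in the thread, each chunk submitted this way currently becomes its own run on the server, so this is a fallback rather than the preferred path.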