On Jan 11, 2008, at 1:29 PM, Ethan Mallove wrote:
> On Fri, Jan/11/2008 12:49:50PM, Jeff Squyres wrote:
>> On Jan 10, 2008, at 10:29 AM, Josh Hursey wrote:
>>> Since we are ramping up to a v1.3 release we want to use visualization
>>> to support this effort. So we want to make sure that the visualization
>>> will meet the development community's needs. We should probably ask
>>> the devel-core list, but I thought I would start some of the
>>> discussion here to make sure I am asking the right questions of the
>> Sounds reasonable.
>> After a first go-round here, we might want to have a conversation with
>> the OMPI RMs to get their input - that would still be a small group
>> to get targeted feedback on these questions.
This sounds good to me.
>>> To start I have some basic questions:
>>> - How does Open MPI determine that it is stable enough to release?
>> I personally have a Magic 8 Ball on my desk that I consult frequently
>> for questions like this. ;-)
Does it have an OMPI sticker on it? :)
>> It's a mix of many different metrics, actually:
>> - stuff unrelated to MTT results:
>> - how many trac tickets are open against that release and do we
>> - how urgent are the bug fixes that are included
>> - external requirements (e.g., get an OMPI release out to meet the
>> OFED release schedule)
>> - ...and probably others
I realize that this is just to complete the list, but we may be able
to (one day in the distant future) link some of the Trac tickets with
MTT testing. This would allow us, for example, to have a link from a
Trac ticket to a special MTT reporter page that shows how well testing
for this bug is going, and who is testing it (or working on it). Just
something to kick around, but it might be neat if MTT and Trac could
play better together one day.
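To make the idea concrete, here is a rough sketch of what such a Trac/MTT link could report. The field names ("ticket", "org", "result") are made up for illustration; the real MTT schema differs:

```python
# Hypothetical sketch: MTT results tagged with the Trac ticket they exercise.
# Field names here are assumptions, not the actual MTT reporter schema.

def summarize_ticket(ticket_id, mtt_results):
    """Summarize pass count and testing orgs for one Trac ticket."""
    runs = [r for r in mtt_results if r["ticket"] == ticket_id]
    passed = sum(1 for r in runs if r["result"] == "pass")
    testers = sorted({r["org"] for r in runs})
    return {"runs": len(runs), "passed": passed, "testers": testers}

results = [
    {"ticket": 1234, "org": "IU", "result": "pass"},
    {"ticket": 1234, "org": "Cisco", "result": "fail"},
    {"ticket": 5678, "org": "Sun", "result": "pass"},
]
print(summarize_ticket(1234, results))
# {'runs': 2, 'passed': 1, 'testers': ['Cisco', 'IU']}
```

A Trac ticket page could then embed exactly this kind of summary as a link back into the MTT reporter.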
>> - related to MTT results
>> - "good" coverage on platforms (where "platform" = host arch, OS,
>> OS version, compiler, compiler version, MCA params, interconnect, and
>> scheduler -- note that some of these are orthogonal from each other)
I think this is the one we are going to focus on in this first pass.
>> - the only failures and timeouts we have are a) repeatable, b)
>> consistent across multiple organizations (if relevant), and deemed to
>> be acceptable
We might be able to help highlight this situation. I'll have to think
about it a bit more.
>>> - What dimensions of testing are most/least important (i.e.,
>>> platforms, compilers, feature sets, scale, ...)?
>> This is a hard question. :-\ I listed several dimensions above:
>> - host architecture
>> - OS
>> - OS version
>> - compiler
>> - compiler version
>> - MCA parameters used
>> - interconnect
>> - scheduler
>> Here's some more:
>> - number of processes tested
>> - layout of processes (by node, by proc, ...etc.)
>> I don't quite know how to order those in terms of priority. :-\
I think that for some of these characteristics it will be
feature-dependent. We may end up with a few lists:
- General acceptance for all the normal cases and the default feature set
- A set of configurations that must pass for opt-in feature X
- A set of configurations that must pass for opt-in feature Y
Each list may have a different visualization associated with it. So we
can say that in the normal use case everything is fine, but when we
test with feature X then these N tests fail. Then we can determine if
feature X is important enough to delay release.
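As a strawman, those per-feature "must pass" lists might look something like this. The configuration tuples and feature names are invented for the example:

```python
# Hypothetical per-feature "must pass" configuration lists, as described
# above.  The tuples (os, compiler, interconnect) are made-up examples.

REQUIRED = {
    "default":   [("linux", "gcc", "tcp"), ("linux", "gcc", "openib")],
    "feature_x": [("linux", "gcc", "tcp"), ("solaris", "sunstudio", "tcp")],
}

def release_blockers(feature, passing_configs):
    """Return the required configurations for `feature` that did not pass."""
    return [c for c in REQUIRED[feature] if c not in passing_configs]

passing = {("linux", "gcc", "tcp"), ("linux", "gcc", "openib")}
print(release_blockers("default", passing))    # []
print(release_blockers("feature_x", passing))  # [('solaris', 'sunstudio', 'tcp')]
```

Each list's blockers could then drive its own visualization, per the idea above.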
>>> - What other questions would be useful to answer with regard to
>>> testing (thinking completely outside of the box)?
>>> * Example: Are we testing a specific platform/configuration set
>>> too much/too little?
>> This is a great question.
>> I would love to be able to configure this question -- e.g., are we
>> testing some MCA params too much/too little.
This is the one question we have been talking about the most. With the
visualization that Joseph was discussing with me, it seems like a
natural fit. It would help us determine how to best organize our
testing efforts so we don't waste time over-testing something while
under-testing something else.
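The simplest version of that over/under-testing question is just counting runs per configuration tuple. A rough sketch (the dimensions used are a subset of those listed earlier; the thresholds are arbitrary):

```python
from collections import Counter

# Illustrative sketch only: count MTT runs per configuration tuple to flag
# over- and under-tested combinations.  Thresholds are arbitrary placeholders.

def coverage_report(runs, low=2, high=10):
    """Return (over_tested, under_tested) configuration tuples."""
    counts = Counter((r["os"], r["compiler"], r["interconnect"]) for r in runs)
    over = sorted(c for c, n in counts.items() if n > high)
    under = sorted(c for c, n in counts.items() if n < low)
    return over, under

runs = (
    [{"os": "linux", "compiler": "gcc", "interconnect": "tcp"}] * 12
    + [{"os": "solaris", "compiler": "sunstudio", "interconnect": "tcp"}]
)
over, under = coverage_report(runs)
print(over)   # [('linux', 'gcc', 'tcp')]
print(under)  # [('solaris', 'sunstudio', 'tcp')]
```

A visualization on top of these counts (heat map per dimension, say) is where it would get genuinely useful.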
>> The performance stuff can always be visualized better, especially over
>> time. One idea is expressed in https://svn.open-mpi.org/trac/mtt/ticket/330
>> I also very much like the ideas in https://svn.open-mpi.org/trac/mtt/ticket/236
>> and https://svn.open-mpi.org/trac/mtt/ticket/302 (302 is not
>> expressed as a visualization issue, but it could be -- you can imagine
>> a tree-based display showing the relationships between phase results,
>> perhaps even incorporated with a timeline -- that would be awesome).
These are good ideas.
>> Here's a whacky idea -- can our MTT data be combined with SCM data
>> (SVN, in this case) to answer questions like:
>> - what parts of the code are the most troublesome? i.e., when this
>> part of the code changes, these tests tend to break
>> - what tests seem to be related to what parts of the OMPI code base?
>> - who / what SVN commit(s) seemed to cause specific tests to break?
>> (this seems like a longer-term set of questions, but I thought I'd
>> bring it up...)
> I like this idea :-)
> A level of indirection missing to do this is keying SVN r
> numbers to files modified. We also need to be able to
> somehow track *new* failures (see
> https://svn.open-mpi.org/trac/mtt/ticket/70). E.g., "was it
> *this* revision that broke test xyz or was it an older one?"
This is a neat idea, and certainly possible. This may be easier than
one would expect. I know Joseph has a fair amount of experience mining
similar Sourceforge data to answer some related questions, so he may
have some ideas here.
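A first cut at the SVN/MTT cross-referencing could be as simple as: key each revision to the files it touched, then ask which revisions' changed files overlap the code a newly failing test covers. The data shapes below are assumptions for illustration; the real SVN and MTT schemas differ:

```python
# Sketch of keying SVN revision numbers to modified files, then flagging
# suspect revisions for a newly failing test.  All paths, test names, and
# revision numbers below are hypothetical examples.

def suspect_revisions(commits, failing_test, test_to_files):
    """Revisions whose changed files overlap the files the test exercises."""
    covered = set(test_to_files[failing_test])
    return [rev for rev, files in commits if covered & set(files)]

commits = [
    (17001, ["ompi/mca/btl/tcp/btl_tcp.c"]),
    (17002, ["orte/runtime/orte_init.c"]),
]
test_to_files = {"ibm/pt2pt/send": ["ompi/mca/btl/tcp/btl_tcp.c"]}
print(suspect_revisions(commits, "ibm/pt2pt/send", test_to_files))
# [17001]
```

The hard part, as Ethan says, is building the test-to-files mapping and detecting *new* failures (ticket 70); this sketch assumes both already exist.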
I CC'ed Joseph on this email so he can see some of the questions being
posed. Joseph, feel free to subscribe to the mtt-devel list if you want
to. It is (I believe) just Ethan, Jeff, and myself, and is fairly low
traffic. Keep the suggestions coming if you think of any more.