MTT Devel Mailing List Archives

From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2006-10-05 12:26:56

On 10/5/06 11:28 AM, "Ethan Mallove" <ethan.mallove_at_[hidden]> wrote:

>> Some of these are the usual "IBM isn't being recorded
>> properly". But a whole bunch of them are wrong ELF
>> classes reported by Sun builds -- is this a symptom of
>> sharing scratch directories?
> Yes, 'wrong ELF classes' is a SPARC binary run on i386 or
> vice versa.
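
For anyone who wants to check which platform a stray binary in a shared scratch tree was built for, here is a minimal sketch that reads the class and machine fields from the ELF header. It is an illustration only and not part of MTT; the function names `describe_elf_header` and `describe_elf_file` are made up for this example, and the machine table lists just a few common values.

```python
import struct

# ELF identification values (from the ELF specification).
ELF_CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}
# A small, deliberately incomplete subset of e_machine values.
ELF_MACHINES = {2: "SPARC", 3: "i386", 18: "SPARC32PLUS", 43: "SPARCV9", 62: "x86-64"}


def describe_elf_header(header):
    """Return (class, machine) for the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = ELF_CLASSES.get(header[4], "unknown")
    # EI_DATA: 1 = little-endian, 2 = big-endian; it governs later fields.
    byte_order = {1: "<", 2: ">"}.get(header[5])
    if byte_order is None:
        raise ValueError("unknown ELF data encoding")
    # e_machine is a 16-bit field at offset 18, after e_ident and e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return elf_class, ELF_MACHINES.get(machine, "unknown (%d)" % machine)


def describe_elf_file(path):
    """Hypothetical convenience wrapper: inspect a file on disk."""
    with open(path, "rb") as f:
        return describe_elf_header(f.read(20))
```

Running `describe_elf_file` on a test executable from the scratch tree on each host should show quickly whether SPARC and i386 builds have been mixed.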

>> I suspect that you need to whack your current scratch dir
>> and then start tonight with at least 2 new scratch
>> directories. More specifically, you need a scratch
>> directory for each MTT invocation that you run.
> So this also underlines the importance of the Trim phase
> that I was not seeing, because it means that if, e.g.,
> 1.2a1r11852 is still in the scratch tree - it will get run
> every night if it's not removed. So we can rm -rf the
> whole tree, but then we can't go back and rerun what we're
> interested in. Right?

No, it *shouldn't* be running every night. In my tests, things are not run
multiple times, but there is currently some bug in this area, because Terry
is seeing things run multiple times. But yes, this underscores the need for
the trim phase.

> I think I also may have something for the wishlist. It would
> be awesome if there was an option to tell mtt to only run
> what is in the ini file. That would eliminate the need for
> multiple scratch trees (which must be set manually). So a
> --non-recursive (?) option or something, that tells mtt to
> not walk the scratch directory and run everything it finds.
> Or does this defeat the purpose of something I'm not seeing?

It does only run what's in the INI file. But unpredictable things can
happen in two cases: a) if you have two MTTs sharing a common scratch, they
can clash over the XML metadata files in there (i.e., I don't do anything to
guarantee atomic access and updates, etc.); or b) if you have older versions
of stuff in your INI file that somehow are not accounted for properly in the
XML data files, MTT could try to run them again.
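
To make the atomic-update point concrete, here is a minimal sketch of the usual write-to-a-temp-file-then-rename technique. It is not taken from MTT; the helper name `atomic_write_text` and the file naming are assumptions for illustration. It only guarantees that a reader never sees a half-written file. It does not stop two writers from overwriting each other's updates, so a separate scratch directory per MTT invocation is still the simple fix.

```python
import os
import tempfile


def atomic_write_text(path, text):
    """Replace `path` with `text` so readers see the old or new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) as the
    # target so that the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```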

Jeff Squyres
Server Virtualization Business Unit
Cisco Systems