Subject: Re: [MTT users] [MTT bugs] [MTT] #212: Generic network locking server
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2009-11-18 13:25:49


On Sat, Nov/07/2009 04:15:42PM, Jeff Squyres wrote:
> On Nov 6, 2009, at 5:18 PM, Ethan Mallove wrote:
>
>> I'm running multiple MTT clients out of the same scratch directory
>> using SGE. I'm running into race conditions between the multiple
>> clients, where one client is overwriting another's data in the .dump
>> files - which is a Very Bad Thing(tm). I'm running the
>> client/mtt-lock-server, and I've added the corresponding [Lock]
>> section in my INI file. Will my MTT clients now not interfere with
>> each other's .dump files? I'm skeptical of this because I don't see,
>> e.g., Lock() calls in SaveRuns(). How do I make my .dump files safe?
>>
>
>
> Err... perhaps this part wasn't tested well...?
>
> I'm afraid it's been forever since I've looked at this code and I'm gearing
> up to leave for the Forum on Tuesday and then staying on for SC09, so it's
> quite likely that you'll be able to look at this in more detail before I
> will. Sorry to pass the buck; just trying to be realistic... :-(

After some digging, I discovered that MTT is not designed to execute
multiple INI sections out of a single scratch directory in parallel.
There's a ticket for this:

  https://svn.open-mpi.org/trac/mtt/ticket/167

The way around this limitation is to have MTT split up the .dump files
by INI section so that two MTT clients running simultaneously never
conflict with each other. (This change did not need to be made for the
Test run .dump files, as MTT already splits them up.) I have attached
a patch, which makes a simple wrapper script for #167 possible. The
changes should not disrupt normal (non-parallel) execution. Anyone
care to give it a try?
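
To illustrate the idea (this is not the patch itself, and MTT is written
in Perl), here is a minimal Python sketch of per-section dump files. The
names dump_path and save_section, the "dumps" subdirectory, and the file
naming scheme are all hypothetical and chosen only for illustration. The
point is that each INI section maps to its own file, so two clients
working on different sections never write to the same .dump file.

```python
import os
import re
from pathlib import Path


def dump_path(scratch, phase, section):
    """Return a .dump path unique to one (phase, INI section) pair.

    Hypothetical naming scheme, not MTT's actual layout.
    """
    # Collapse characters that are awkward in file names (spaces, colons,
    # slashes) into underscores. Two section names that differ only in
    # such characters would collide; a real implementation must handle that.
    safe = re.sub(r"[^A-Za-z0-9_.-]+", "_", section).strip("_")
    return Path(scratch) / "dumps" / f"{phase}-{safe}.dump"


def save_section(scratch, phase, section, text):
    """Write one section's data to its own dump file and return the path."""
    path = dump_path(scratch, phase, section)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a per-process temp file, then rename it into place, so a
    # reader never sees a half-written dump file.
    tmp = path.with_name(f"{path.name}.{os.getpid()}.tmp")
    tmp.write_text(text)
    os.replace(tmp, path)
    return path
```

With this layout, a client handling "MPI Install: gnu" and another
handling "MPI Install: intel" end up with two separate files under the
same scratch directory, which is the property a wrapper script would rely
on when launching one client per INI section.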

-Ethan

>
> --
> Jeff Squyres
> jsquyres_at_[hidden]