MTT Devel Mailing List Archives

Subject: Re: [MTT users] [MTT bugs] [MTT] #212: Generic networklocking server *REVIEW NEEDED*
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2010-02-17 16:57:38


Sorry for the delay...

I see the comments like this:

+ # We write the entire MPI::sources hash to file, even
+ # though the filename indicates a single INI section
+ # MTT::Util::hashes_merge will take care of duplicate
+ # hash keys. The reason for splitting up the .dump files
+ # is to keep them read and write safe across INI sections

I'm a little confused by this. I see that the goal is to have multiple MTT clients running simultaneously, all sharing a single $scratch. Per the comment above, you're writing all current data to the .dump file, even if it covers more than the one section that the parameters (and filename) imply. You're relying on MTT::Util::hashes_merge() to "figure it out" and create one unified tree underneath.
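
A rough sketch of the scheme as described in that comment, written in Python rather than MTT's Perl. Everything here is an assumption made for illustration: the function names, the JSON on-disk format, and the `mpi_sources-<section>.dump` file naming are hypothetical and are not taken from the patch. Each client writes its whole in-memory tree to a file named after one section, and a reader rebuilds one unified tree by merging every dump file it finds.

```python
import json
import os
import tempfile


def save_dump(scratch, section, tree):
    """Write the whole tree to a dump file named after one INI section.

    Hypothetical helper: the file name and JSON format are assumptions.
    """
    path = os.path.join(scratch, f"mpi_sources-{section}.dump")
    # Write to a temporary file first, then rename it into place, so a
    # concurrent reader never sees a half-written dump file.
    fd, tmp = tempfile.mkstemp(dir=scratch)
    with os.fdopen(fd, "w") as f:
        json.dump(tree, f)
    os.replace(tmp, path)
    return path


def merge(dst, src):
    """Recursively merge src into dst; if a leaf key clashes, src wins."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dst.get(key), dict):
            merge(dst[key], value)
        else:
            dst[key] = value
    return dst


def load_all(scratch):
    """Rebuild one unified tree from every .dump file in scratch."""
    tree = {}
    for name in sorted(os.listdir(scratch)):
        if name.endswith(".dump"):
            with open(os.path.join(scratch, name)) as f:
                merge(tree, json.load(f))
    return tree
```

In this sketch, two clients that each dump overlapping but identical data merge cleanly, because duplicate keys with equal values collapse into one entry. Note that `merge` silently lets the later file win when values differ.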

I'm a bit worried: aren't there cases where you can end up with a conflict? I.e., hash A has value X for key K, but hash B has value Y for the same key K?
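
One way to make that concern concrete is a check that walks two nested hashes and reports every key where they disagree. The following is a minimal Python sketch, not MTT code; `find_conflicts` is a hypothetical name, and nothing here reflects how MTT::Util::hashes_merge actually resolves duplicates.

```python
def find_conflicts(a, b, path=()):
    """Return (key_path, value_in_a, value_in_b) for each key where a and b disagree.

    Hypothetical helper for illustration only.
    """
    conflicts = []
    # Only keys present in both hashes can conflict.
    for key in sorted(a.keys() & b.keys()):
        here = path + (key,)
        if isinstance(a[key], dict) and isinstance(b[key], dict):
            # Both sides are sub-hashes: descend and compare their keys.
            conflicts.extend(find_conflicts(a[key], b[key], here))
        elif a[key] != b[key]:
            conflicts.append((here, a[key], b[key]))
    return conflicts


hash_a = {"section": {"K": "X", "other": 1}}
hash_b = {"section": {"K": "Y", "other": 1}}
print(find_conflicts(hash_a, hash_b))
# prints [(('section', 'K'), 'X', 'Y')]
```

A merge that ran such a check first could warn or abort instead of silently picking one value.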

On Feb 11, 2010, at 12:09 PM, Ethan Mallove wrote:

> This apparently got lost in the shuffle a few months ago. The fix
> allows one to kick off all of their MPI Installs and Test Builds in
> parallel. Give it a try when you have a chance.
>
> -Ethan
>
>
> > On Sat, Nov/07/2009 04:15:42PM, Jeff Squyres wrote:
> > > On Nov 6, 2009, at 5:18 PM, Ethan Mallove wrote:
> > >
> > >> I'm running multiple MTT clients out of the same scratch directory
> > >> using SGE. I'm running into race conditions between the multiple
> > >> clients, where one client is overwriting another's data in the .dump
> > >> files - which is a Very Bad Thing(tm). I'm running the
> > >> client/mtt-lock-server, and I've added the corresponding [Lock]
> > >> section in my INI file. Will my MTT clients now not interfere with
> > >> each other's .dump files? I'm skeptical of this because I don't see,
> > >> e.g., Lock() calls in SaveRuns(). How do I make my .dump files safe?
> > >>
> > >
> > >
> > > Err... perhaps this part wasn't tested well...?
> > >
> > > I'm afraid it's been forever since I've looked at this code and I'm gearing
> > > up to leave for the Forum on Tuesday and then staying on for SC09, so it's
> > > quite likely that you'll be able to look at this in more detail before I
> > > will. Sorry to pass the buck; just trying to be realistic... :-(
> >
> > After some digging, I discover that MTT is not designed to execute
> > multiple INI sections out of a single scratch directory in parallel.
> > There's a ticket for this:
> >
> > https://svn.open-mpi.org/trac/mtt/ticket/167
> >
> > The way around this limitation is to have MTT split up the .dump files
> > by INI section so that two MTT clients running simultaneously never
> > conflict with each other. (This change did not need to be made for the
> > Test run .dump files, as MTT already splits them up.) I have attached
> > a patch, which makes a simple wrapper script for #167 possible. The
> > changes should not disrupt normal (non-parallel) execution. Anyone
> > care to give it a try?
> >
> > -Ethan
> >
> > >
> > > --
> > > Jeff Squyres
> > > jsquyres_at_[hidden]
> > >
> > > _______________________________________________
> > > mtt-users mailing list
> > > mtt-users_at_[hidden]
> > > http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
>
> <mtt-safe-dump-files.diff>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/