MTT Users Mailing List Archives

This page is part of a frozen web archive of this mailing list.

You can still navigate around this archive, but know that no new mails have been added to it since July of 2016.

Subject: Re: [MTT users] [MTT bugs] [MTT] #212: Generic networklockingserver *REVIEW NEEDED*
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2010-02-19 12:00:55

On Thu, Feb/18/2010 04:13:15PM, Jeff Squyres wrote:
> On Feb 18, 2010, at 10:48 AM, Ethan Mallove wrote:
> > To ensure there is never a collision between $a->{k} and $b->{k}, the
> > user can have two MTT clients share a $scratch, but they cannot both
> > run the same INI section simultaneously. I set up my scheduler to run
> > batches of MPI get, MPI install, Test get, Test build, and Test run
> > sections in parallel with successor INI sections dependent on their
> > predecessor INI sections (e.g., [Test run: foo] only runs after [Test
> > build: foo] completes). The limitation stinks, but the current
> > limitation is much worse: two MTT clients can't even run the same
> > *phase* out of one $scratch.
> It might be a little nicer just to protect the user from
> themselves -- if we ever detect a case where $a->{k} and $b->{k}
> both exist and are not the same value, dump out everything to a file
> and abort with an error message. This is clearly an erroneous
> situation, but running MTT in big parallel batches like this is a
> worthwhile-but-complicated endeavor, and some people are likely to
> get it wrong. So we should at least detect the situation and fail
> gracefully, rather than losing or corrupting results.
> Make sense?

Yes. I'll add this.
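
For reference, the check could look roughly like the sketch below. It is
written in Python purely for illustration (MTT itself is Perl), and every
name in it (merge_or_abort, MergeConflict, dump_path) is hypothetical, not
an existing MTT function. It assumes the two sides are flat hashes keyed
like $a->{k} and $b->{k}; deeper structures would need a recursive
comparison.

```python
import json


class MergeConflict(Exception):
    """Raised when both sides hold different values for the same key."""


def merge_or_abort(a, b, dump_path):
    """Merge dict b into a copy of dict a (hypothetical helper).

    If a key exists on both sides with different values, write both
    inputs to dump_path for later inspection and raise MergeConflict
    instead of silently overwriting either side.
    """
    # Keys present on both sides whose values disagree.
    conflicts = sorted((k for k in a.keys() & b.keys() if a[k] != b[k]), key=str)
    if conflicts:
        # Dump everything before aborting so no results are lost.
        with open(dump_path, "w") as fh:
            json.dump({"a": a, "b": b, "conflicts": conflicts},
                      fh, indent=2, default=str)
        raise MergeConflict(
            "conflicting values for keys %s; both sides dumped to %s"
            % (conflicts, dump_path)
        )
    merged = dict(a)
    merged.update(b)
    return merged
```

Identical values on both sides are treated as harmless, so two clients that
recorded the same result for a key still merge cleanly; only a real
disagreement triggers the dump and the abort.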


> > I originally wanted the .dump files to be completely safe, but MTT
> > clients were getting locked out of the .dump files for way too long.
> > E.g., MTT::MPI::LoadInstalls happens very early in client/mtt, and an
> > hour could elapse before MTT::MPI::SaveInstalls is called in
> >
> Yep, if you lock from load->save, then that can definitely happen...
> --
> Jeff Squyres
> jsquyres_at_[hidden]