Subject: Re: [MTT users] [MTT bugs] [MTT] #212: Generic networklockingserver *REVIEW NEEDED*
From: Ethan Mallove (ethan.mallove_at_[hidden])
Date: 2010-02-19 12:00:55


On Thu, Feb/18/2010 04:13:15PM, Jeff Squyres wrote:
> On Feb 18, 2010, at 10:48 AM, Ethan Mallove wrote:
>
> > To ensure there is never a collision between $a->{k} and $b->{k}, the
> > user can have two MTT clients share a $scratch, but they cannot both
> > run the same INI section simultaneously. I set up my scheduler to run
> > batches of MPI get, MPI install, Test get, Test build, and Test run
> > sections in parallel with successor INI sections dependent on their
> > predecessor INI sections (e.g., [Test run: foo] only runs after [Test
> > build: foo] completes). That limitation stinks, but the current
> > one is much worse: two MTT clients can't even run the same
> > *phase* out of one $scratch.
>
> It might be a little nicer just to protect the user from
> themselves -- if we ever detect a case where $a->{k} and $b->{k}
> both exist and are not the same value, dump out everything to a file
> and abort with an error message. This is clearly an erroneous
> situation, but running MTT in big parallel batches like this is a
> worthwhile-but-complicated endeavor, and some people are likely to
> get it wrong. So we should at least detect the situation and fail
> gracefully, rather than losing or corrupting results.
>
> Make sense?

Yes. I'll add this.
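
For concreteness, the check could look roughly like the sketch below. It
is written in Python only to illustrate the logic (MTT itself is Perl),
and every name in it (merge_or_abort, CollisionError, the dump file
prefix, the JSON format) is made up for the example, not existing MTT
code. It assumes the two hashes are flat and JSON-serializable.

```python
import json
import os
import tempfile


class CollisionError(Exception):
    """Raised when both hashes hold different values for the same key."""


def merge_or_abort(a, b, dump_dir):
    """Merge b into a copy of a; on a conflicting key, dump both and raise."""
    # A collision is a key present in both hashes with different values.
    conflicts = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    if conflicts:
        # Write everything to a fresh file so nothing is lost before aborting.
        fd, path = tempfile.mkstemp(
            prefix="mtt-collision-", suffix=".json", dir=dump_dir
        )
        with os.fdopen(fd, "w") as fh:
            json.dump({"a": a, "b": b, "conflicts": conflicts}, fh,
                      indent=2, default=str)
        raise CollisionError(
            f"conflicting keys {conflicts}; both hashes dumped to {path}"
        )
    merged = dict(a)
    merged.update(b)
    return merged
```

Keys that exist in both hashes with the same value are not treated as
an error; only differing values trigger the dump and abort.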

-Ethan

>
> > I originally wanted the .dump files to be completely safe, but MTT
> > clients were getting locked out of the .dump files for way too long.
> > E.g., MTT::MPI::LoadInstalls happens very early in client/mtt, and an
> > hour could elapse before MTT::MPI::SaveInstalls is called in
> > Install.pm.
>
> Yep, if you lock from load->save, then that can definitely happen...
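
The alternative is to hold the lock only for the short re-read, merge,
and rewrite at save time. A rough sketch of that idea follows, again in
Python purely for illustration (MTT is Perl). The function name
save_merged, the ".lock" and ".tmp" suffixes, and the JSON format are
all hypothetical, and a local flock() stands in here for whatever
locking mechanism is actually used.

```python
import fcntl
import json
import os


def save_merged(dump_path, my_results):
    """Hold the lock only while re-reading, merging, and rewriting the file."""
    lock_path = dump_path + ".lock"
    with open(lock_path, "w") as lock_fh:
        fcntl.flock(lock_fh, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            # Re-read under the lock so another client's saved results survive.
            on_disk = {}
            if os.path.exists(dump_path):
                with open(dump_path) as fh:
                    on_disk = json.load(fh)
            on_disk.update(my_results)  # assumes clients write disjoint keys
            # Write to a temp file, then rename, so readers never see a
            # half-written dump.
            tmp_path = dump_path + ".tmp"
            with open(tmp_path, "w") as fh:
                json.dump(on_disk, fh)
            os.replace(tmp_path, dump_path)
        finally:
            fcntl.flock(lock_fh, fcntl.LOCK_UN)
    return on_disk
```

The lock is held for a few file operations rather than for the whole
load->save window, at the cost of relying on clients not writing the
same keys.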
>
> --
> Jeff Squyres
> jsquyres_at_[hidden]