Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] Setting AUTOMAKE_JOBS
From: Paul H. Hargrove (PHHargrove_at_[hidden])
Date: 2010-09-24 23:08:55


I don't feel as strongly about this as Ralph, but do think the new
behavior violates the "principle of least surprise".

-Paul

Ralph Castain wrote:
> Been thinking about this more today, and I actually find this new
> "feature" disturbing. It bothers me that OMPI is now dictating that it
> will do a parallel build without my knowledge unless I specifically
> tell it not to. If it were technically possible, would we next force
> "make -j4"?? How would the developer community feel if the authors of
> "make" suddenly decided that it would run 4 parallel threads under the
> covers unless you specifically told it not to?
>
> What bugs me here is that I now have to remember to set something in
> my environment to tell OMPI "you don't get to hog all my processors".
> Maybe others twiddle their thumbs and leave the computer alone while
> OMPI builds, or maybe they rarely build - but I build frequently, and
> I am always multi-tasking my time (running Word, Powerpoint, etc.). So
> having OMPI default to running a parallel build is more than a little
> annoying - frankly, it pisses me off.
>
> I really feel that this "feature" should be implemented as an option
> passed to autogen instead of a hidden forced behavior. If someone
> wants to run a parallel build, then by all means let them ask for it
> (ala "make -j4"). But don't just -do- it.
>
> Grrrr....
> Ralph
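
For illustration only, the opt-in behavior Ralph asks for could look roughly like the Python sketch below. The real autogen.pl is a Perl script and this is not its code; the `--jobs` option and the `parse_args` and `automake_env` helpers are hypothetical names. The only point is that AUTOMAKE_JOBS is left untouched unless the user explicitly asks for parallelism.

```python
import argparse


def parse_args(argv):
    """Parse the command line of a hypothetical autogen wrapper."""
    parser = argparse.ArgumentParser(prog="autogen-sketch")
    # Hypothetical option: parallelism happens only when the user asks for it.
    parser.add_argument(
        "-j", "--jobs", type=int, default=None, metavar="N",
        help="run automake with up to N parallel jobs (default: not set)",
    )
    return parser.parse_args(argv)


def automake_env(base_env, jobs):
    """Return a copy of base_env, setting AUTOMAKE_JOBS only on request."""
    env = dict(base_env)
    if jobs is not None:
        if jobs < 1:
            raise ValueError("jobs must be at least 1")
        env["AUTOMAKE_JOBS"] = str(jobs)
    return env
```

With no option given, the environment passes through unchanged, so a value the user already exported is still honored and nothing is set behind their back.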
>
>
> On Fri, Sep 24, 2010 at 7:28 AM, Ralph Castain <rhc_at_[hidden]> wrote:
>
> I hope you'll understand if I don't run that test while on the
> road...one battery yank per week is my limit :-)
>
>
> On Fri, Sep 24, 2010 at 4:40 AM, Jeff Squyres (jsquyres)
> <jsquyres_at_[hidden]> wrote:
>
> Also to clarify:
>
> - Did autogen set am-jobs to 2 in your case? (It should do
> that if lstopo is not found; it also limits itself to a max of 4.)
>
> - In the same scenario, what happens if you manually set
> am-jobs to 1 and run autogen? I.e., do you get the same
> heat/sluggishness? I have experienced VMs causing this kind
> of behavior just because they are running, causing CPU and
> memory pressure.
>
> Sent from my PDA. No type good.
>
> On Sep 24, 2010, at 12:49 AM, "Ralph Castain"
> <rhc_at_[hidden]> wrote:
>
>> Sent to both for reference (see below)
>>
>> Just to clarify. It wasn't a deadlock situation, but rather
>> that the machine was overloaded and running so hard that the
>> response to keystrokes was multiple seconds. Thus, there was
>> no way to shut it down from the keyboard or screen. Even a
>> ctrl-c was just getting ignored for a very long time due to
>> the overload.
>>
>> I was running vmware on my machine, and doing a heavy
>> compile/build in it. On top of this, I had email, editor, and
>> browsers running - and then kicked off a fresh build in a
>> terminal window. With Jeff's default settings, this latter
>> build thought it would be running alone on the machine, and
>> promptly generated a number of threads equal to all the
>> processors. Since they were already loaded, this drove the
>> machine into the ground.
>>
>> My point is just that it is unwise to assume that the OMPI
>> build can utilize all available processors. I'm sure it's
>> fine for the MTT runs, especially on Jeff's machines as they
>> are dedicated to that purpose - just not a good general
>> assumption.
>>
>>
>> HTH
>> Ralph
>>
>> ====================================
>> Output of "perl -V":
>>
>> Summary of my perl5 (revision 5 version 8 subversion 9)
>> configuration:
>> Platform:
>> osname=darwin, osvers=10.2.0, archname=darwin-2level
>> uname='darwin sjc-rcastain-87111.cisco.com 10.2.0 darwin kernel
>> version 10.2.0: tue nov 3 10:37:10 pst 2009;
>> root:xnu-1486.2.11~1release_i386 i386 '
>> config_args='-des -D prefix=/opt/local -D
>> scriptdir=/opt/local/bin -D cppflags=-I/opt/local/include -D
>> ccflags=-O2 -arch x86_64 -D ldflags=-L/opt/local/lib -D
>> vendorprefix=/opt/local -D man1ext=1pm -D man3ext=3pm -D
>> cc=/usr/bin/gcc-4.2 -D ld=/usr/bin/gcc-4.2 -D
>> man1dir=/opt/local/share/man/man1p -D
>> man3dir=/opt/local/share/man/man3p -D
>> siteman1dir=/opt/local/share/man/man1 -D
>> siteman3dir=/opt/local/share/man/man3 -D
>> vendorman1dir=/opt/local/share/man/man1 -D
>> vendorman3dir=/opt/local/share/man/man3 -D
>> inc_version_list=5.8.8 5.8.8/darwin-2level -U i_bind -U
>> i_gdbm -U i_db'
>> hint=recommended, useposix=true, d_sigaction=define
>> usethreads=undef use5005threads=undef useithreads=undef
>> usemultiplicity=undef
>> useperlio=define d_sfio=undef uselargefiles=define
>> usesocks=undef
>> use64bitint=define use64bitall=define uselongdouble=undef
>> usemymalloc=n, bincompat5005=undef
>> Compiler:
>> cc='/usr/bin/gcc-4.2', ccflags ='-O2 -arch x86_64
>> -fno-common -DPERL_DARWIN -I/opt/local/include
>> -no-cpp-precomp -fno-strict-aliasing -pipe
>> -I/usr/local/include -I/opt/local/include',
>> optimize='-O3',
>> cppflags='-I/opt/local/include -no-cpp-precomp -O2 -arch
>> x86_64 -fno-common -DPERL_DARWIN -I/opt/local/include
>> -no-cpp-precomp -fno-strict-aliasing -pipe
>> -I/usr/local/include -I/opt/local/include'
>> ccversion='', gccversion='4.2.1 (Apple Inc. build 5646)
>> (dot 1)', gccosandvers=''
>> intsize=4, longsize=8, ptrsize=8, doublesize=8,
>> byteorder=12345678
>> d_longlong=define, longlongsize=8, d_longdbl=define,
>> longdblsize=16
>> ivtype='long', ivsize=8, nvtype='double', nvsize=8,
>> Off_t='off_t', lseeksize=8
>> alignbytes=8, prototype=define
>> Linker and Libraries:
>> ld='env MACOSX_DEPLOYMENT_TARGET=10.3 /usr/bin/gcc-4.2',
>> ldflags ='-L/opt/local/lib -L/usr/local/lib'
>> libpth=/usr/local/lib /opt/local/lib /usr/lib
>> libs=-ldbm -ldl -lm -lutil -lc
>> perllibs=-ldl -lm -lutil -lc
>> libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false,
>> libperl=libperl.a
>> gnulibc_version=''
>> Dynamic Linking:
>> dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef,
>> ccdlflags=' '
>> cccdlflags=' ', lddlflags='-L/opt/local/lib -bundle
>> -undefined dynamic_lookup -L/usr/local/lib'
>>
>>
>> Characteristics of this binary (from libperl):
>> Compile-time options: PERL_MALLOC_WRAP USE_64_BIT_ALL
>> USE_64_BIT_INT
>> USE_FAST_STDIO USE_LARGE_FILES USE_PERLIO
>> Built under darwin
>> Compiled at Feb 13 2010 13:19:33
>> @INC:
>> /opt/local/lib/perl5/site_perl/5.8.9/darwin-2level
>> /opt/local/lib/perl5/site_perl/5.8.9
>> /opt/local/lib/perl5/site_perl
>> /opt/local/lib/perl5/vendor_perl/5.8.9/darwin-2level
>> /opt/local/lib/perl5/vendor_perl/5.8.9
>> /opt/local/lib/perl5/vendor_perl
>> /opt/local/lib/perl5/5.8.9/darwin-2level
>> /opt/local/lib/perl5/5.8.9
>> .
>>
>> On Thu, Sep 23, 2010 at 10:26 PM, Ralf Wildenhues
>> <Ralf.Wildenhues_at_[hidden]> wrote:
>>
>> Hello Ralph,
>>
>> wow, that's not good to hear. I knew the perl ithreads implementation
>> wasn't all that efficient, but causing a deadlock sounds like you have
>> more trouble than just perl; at least I hope so. For reference, can
>> you send 'perl -V' output (if you like, to the bug-automake at gnu.org
>> list).
>>
>> Thanks,
>> Ralf
>>
>> * Ralph Castain wrote on Fri, Sep 24, 2010 at 03:12:16AM CEST:
>> > I found one major negative to this change - it assumes that my build is
>> > being done in exclusion of anything else on my computer. Unfortunately, this
>> > is never true.
>> >
>> > So my laptop hemorrhaged itself into frozen silence, overheated to the point
>> > of being burning hot, and had to have its battery yanked to stop the runaway
>> > behavior. Not a really good thing.
>> >
>> > I would suggest you default this "heuristic" out, and let someone set it to
>> > use multiple runs if-and-only-if they want it. Hate to cite the lowest
>> > common denominator, but this was a very nasty surprise.
>> >
>> >
>> >
>> > On Wed, Sep 22, 2010 at 7:50 AM, Jeff Squyres
>> > <jsquyres_at_[hidden]> wrote:
>> >
>> > > Some of you may be unaware that recent versions of automake can run in
>> > > parallel. That is, automake will run in parallel with a degree of (at most)
>> > > $AUTOMAKE_JOBS. This can speed up the execution time of autogen.pl quite
>> > > a bit on some platforms. On my cluster at Cisco, here are a few quick timings
>> > > of the entire autogen.pl process (of which automake is the bottleneck):
>> > >
>> > > $AUTOMAKE_JOBS value    Total wall time of autogen.pl (min:sec)
>> > > 8                       3:01.46
>> > > 4                       2:55.57
>> > > 2                       3:28.09
>> > > 1                       4:38.44
>> > >
>> > > This is an older Xeon machine with 2 sockets, each with 2 cores.
>> > >
>> > > There's a nice performance jump from 1 to 2, and a smaller jump from 2 to
>> > > 4. 4 and 8 are close enough to not matter. YMMV.
>> > >
>> > > I just committed a heuristic to autogen.pl to setenv AUTOMAKE_JOBS if it
>> > > is not already set (https://svn.open-mpi.org/trac/ompi/changeset/23788):
>> > >
>> > > - If lstopo is found in your $PATH, autogen.pl runs it and counts how many
>> > > PUs (processing units) you have. It'll set AUTOMAKE_JOBS to that number, up
>> > > to a maximum of 4 (which is admittedly a further heuristic).
>> > > - If lstopo is not found, it just sets AUTOMAKE_JOBS to 2.
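
As a rough illustration, the heuristic described in the two bullets above can be sketched in Python as follows. This is not the committed autogen.pl code (which is Perl). The function names are made up for the example. The `--of console` option and the `PU L#<n>` line format are assumptions about lstopo's text output, and falling back to 1 when no PU lines are found is the sketch's own guard, not something stated in the message.

```python
import os
import re
import shutil
import subprocess

MAX_JOBS = 4       # cap mentioned in the message
FALLBACK_JOBS = 2  # value used when lstopo is not found


def count_pus(lstopo_text):
    """Count processing units in lstopo text output.

    Assumption: each PU appears as 'PU L#<n>' in the console format.
    """
    return len(re.findall(r"\bPU L#\d+", lstopo_text))


def choose_jobs(env, lstopo_text):
    """Return the AUTOMAKE_JOBS value this sketch of the heuristic picks.

    env is a mapping of environment variables; lstopo_text is the
    lstopo output, or None when lstopo was not found in $PATH.
    """
    if "AUTOMAKE_JOBS" in env:
        return env["AUTOMAKE_JOBS"]  # already set: leave it alone
    if lstopo_text is None:
        return str(FALLBACK_JOBS)
    # Guard (not in the original description): never go below 1.
    return str(max(1, min(count_pus(lstopo_text), MAX_JOBS)))


def read_lstopo():
    """Run lstopo if it is in $PATH and return its output, else None."""
    exe = shutil.which("lstopo")
    if exe is None:
        return None
    # Assumption: '--of console' forces plain-text output.
    result = subprocess.run([exe, "--of", "console"],
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    os.environ["AUTOMAKE_JOBS"] = choose_jobs(os.environ, read_lstopo())
    print("AUTOMAKE_JOBS =", os.environ["AUTOMAKE_JOBS"])
```

Note that the decision depends only on how many PUs exist, not on how busy they are, which is the behavior objected to earlier in this thread.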
>> > >
>> > > Enjoy.
>> > >
>> > > --
>> > > Jeff Squyres
>> > > jsquyres_at_[hidden]
>> > > For corporate legal information go to:
>> > > http://www.cisco.com/web/about/doing_business/legal/cri/
>> _______________________________________________
>> devel mailing list
>> devel_at_[hidden]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

-- 
Paul H. Hargrove                          PHHargrove_at_[hidden]
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900