Re: Thoughts on limiting the parallel build load, suggestion for a new "

bug-make

From:	Howard Chu
Subject:	Re: Thoughts on limiting the parallel build load, suggestion for a new "-j auto" option
Date:	Thu, 01 Mar 2012 10:44:55 -0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:12.0a1) Gecko/20111224 Firefox/12.0a1 SeaMonkey/2.9a1

Edward Welbourne wrote:

Go and shoot the moron who wrote that script.


Easy to say, but no use to someone who needs to deploy a package whose
build infrastructure includes a badly-written script.  If the author of
that script has been shot, there's probably no-one alive who understands
how to build the package.  Quite possibly, the author only knocked the
script together for personal use, but made it available to the project
as a start-point for others to set up their own builds, only to see it
chucked into the distribution because it was more use than nothing.

They could have committed the script with either totally benign settings, or obviously broken settings that force the user to substitute in valid ones. (Since otherwise most users will ignore any documentation.)


E.g., changed to "make -j PICK_A_NUMBER" would have been better.

The make -j value should never be encoded in any file; it should only
ever be set by the user invoking the actual top level make command.


and therein lies the problem with -j and -l; they require the user to
have a sophisticated understanding of the system parameters and build
load profile in order to guess a half-way decent value for them.  Such
options are much less useful than ones that have sensible defaults.


Make *does* have sensible defaults. The default is to run serially.

Sometimes computing is hard. Users *do* need a sophisticated understanding of their systems. That's life.

For developers of the package being built, it's possible to learn - by
trial and error - what settings tend to work out reasonably sensibly on
specific machines.  I have heard various ad hoc rules of thumb in use,
typically number of CPUs plus either a fixed offset or some fraction of
the total number of CPUs, as either -j or -l value.  In the end, I know
that my old machine grinds to a halt if I let make take load > about 4
and my shiny new machine accepts -j 8 -l 12 without breaking a sweat -
and gets the build done in a few minutes, instead of half an hour.
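The rules of thumb mentioned above can be written down as a small sketch. The function names, the `job_offset` parameter and the `load_fraction` parameter are assumptions made for illustration; nothing here is part of make. With 8 CPUs and a fraction of 0.5 it yields the "-j 8 -l 12" figures quoted above, though whether those values suit a given machine still has to be found by trial and error.

```python
import os


def rule_of_thumb_flags(cpus=None, job_offset=0, load_fraction=0.5):
    """Return (jobs, max_load) using a 'CPUs plus offset or fraction' rule.

    job_offset and load_fraction are hypothetical tuning knobs; the thread
    gives no agreed values for either of them.
    """
    if cpus is None:
        cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    jobs = max(1, cpus + job_offset)
    max_load = cpus + cpus * load_fraction
    return jobs, max_load


def make_command(jobs, max_load):
    """Build the argument list for a make run with explicit -j and -l."""
    return ["make", "-j", str(jobs), "-l", f"{max_load:g}"]


if __name__ == "__main__":
    print(" ".join(make_command(*rule_of_thumb_flags())))
```

The value is still chosen by whoever runs the top-level command; the sketch only saves the arithmetic.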


Yes. If you know how to improve on trial and error, go ahead and post your code.

For anyone who has to build a package that isn't the primary business of
their work - for example, a distribution maintainer, for whom each
package is just one in a horde of many - configuring the right -j and -l
flags for each run of make is not practical.  It would make sense to
provide them with a sensible way to tell make to make the best use it can of
the CPUs available, without bringing the system to its knees.

It might make sense, but it's not the developers' responsibility to know what is sensible for every machine configuration out there. It is the end user's responsibility to know something about the system they're using. You can't expect developers to even have exposure to all the possible parallel configurations on which someone will attempt to build their software.

A distro maintainer probably has a farm of machines to build on. That set of machines is already a known quantity to them, because presumably they've already done a lot of builds on those machines. They're the ones with the most knowledge of how the machines behave, not any particular developer.

You can't just encode "# CPUs x <fudge factor>" into the Make source. The CPUs may be fake to begin with (e.g., SMT with Intel Hyperthreading or Sun Niagara). Putting a database of CPU types/knowledge into Make and maintaining it would be ludicrous. End users should know their own machines.
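To illustrate the "fake CPU" problem, here is a minimal sketch that separates logical processors from physical cores. It assumes the Linux x86 layout of /proc/cpuinfo, where each processor block carries "physical id" and "core id" fields; other architectures and operating systems report this differently or not at all, which is the maintenance burden described above. The function name is made up for this example.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) CPU counts from /proc/cpuinfo-style text.

    Assumes Linux x86 field names. If the topology fields are absent,
    the physical count falls back to the logical count.
    """
    logical = 0
    physical = set()
    # Processor blocks are separated by blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        if "physical id" in fields and "core id" in fields:
            # SMT siblings share the same (package, core) pair.
            physical.add((fields["physical id"], fields["core id"]))
    return logical, (len(physical) or logical)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        print(count_cores(handle.read()))
```

On a machine with SMT enabled the first number is typically larger than the second, and neither is automatically the right -j value.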

The present unconstrained -j behaviour is, in any case, self-defeating.
It starts so many processes that the time the machine spends swapping
them all out and back in again swamps the time they spend actually doing
any useful work.  The build would complete sooner if fewer processes
were started.
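One way to picture "fewer processes" is to cap the job count by memory as well as by CPUs, so the jobs fit in RAM instead of swapping. This is only a sketch of that idea, not anything make does: `per_job_mb` is a hypothetical estimate that the person running the build would have to supply from experience with the package.

```python
def capped_jobs(cpus, available_mb, per_job_mb):
    """Return a job count limited by both CPU count and available memory.

    per_job_mb is an assumed, user-supplied estimate of the peak memory
    of a single compile job; it cannot be derived from the machine alone.
    """
    if per_job_mb <= 0:
        raise ValueError("per_job_mb must be positive")
    by_memory = available_mb // per_job_mb
    # Always allow at least one job so the build can make progress.
    return max(1, min(cpus, by_memory))


if __name__ == "__main__":
    # Example figures only: 8 CPUs, 4 GiB free, about 1 GiB per job.
    print(capped_jobs(8, 4096, 1024))
```

With those example figures the result is 4 jobs rather than 8, since memory runs out before the CPUs do.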

Pretty sure the documentation already says so too. Anyone using just "make -j" is a moron.

I think R. Diez's suggestions are a constructive step towards designing
some half-way sensible heuristics - those with better understanding of
what make's internal decision-process looks like can doubtless improve
on them, which I'm sure is R. Diez's hope and intent in starting this
discussion.  We don't need -j auto to achieve perfectly optimal tuning
on every machine; it'll be good enough if it can do builds faster than
make's default implicit -j 1, without (too often) loading the machine so
heavily as to interfere with browsing, reading mail, playing nethack and
other things developers like to still be able to do while waiting for a
build.

         Eddy.



--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
