feature request: parallelize make builds.
current problem: make runs its jobs serially by default. there is room for making it series-parallel.
I have been toying with this idea of parallel builds to gain project compile speed (reducing time to a fraction) for quite a while.
compiles seem to spend more time on cpu and memory than they do on disk for large monolithic files. but for the typical small files on most projects, I should think they would be more disk bound.
I have 2 machines I could test with (one with 2 threads and 32-bit and 38GB VM and 3GB RAM, one with 12 threads and 64-bit and 64GB memory and 64GB VM), both with about 50+ projects I could parallelize the builds on. but it would take some manual typing work to do this kind of testing if you want it done (I may do it anyway, I want to parallelize my builds somehow to save time and make use of this nice fast proc). right now I am transferring the files to my new machine. so it could be a month before I get anywhere. or less.
the problem with my current build systems is that they are batch files and I don't use make. I would have to convert the build systems for all of my 50+ projects, and also redo my build setup somehow by making some sort of mingw make template, and I am no make expert. I am just putting this idea out there for someone to grab onto and implement. should I just feed it to the GNU folks via a bug report?
on to the idea...
if you want to compile anything big without losing hair, you should start compiling individual items in parallel where possible. in fact, within that, the compilers themselves should be multithreaded if possible, so that you can either set a limit on the number of threads (as long as it doesn't go over what the system has available) or let them auto-detect the number of threads by default. although compilers do seem very much a serial thing from what little I remember of my compiler class from 20 years ago...
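here is a minimal sketch of the "set a limit or auto-detect" rule, written in python only because it is short. the function name pick_thread_count is made up for illustration; os.cpu_count() is the standard library call that reports how many logical cpus the system has.

```python
import os

def pick_thread_count(requested=None):
    """Return how many compile jobs to run at once.

    requested=None means auto-detect. An explicit request is clamped so
    it never goes over what the system reports as available.
    """
    available = os.cpu_count() or 1  # cpu_count() may return None
    if requested is None:
        return available
    if requested < 1:
        raise ValueError("thread count must be at least 1")
    return min(requested, available)
```

so pick_thread_count() gives the auto-detected count, and pick_thread_count(12) gives 12 on a 12-thread machine but only 2 on a 2-thread one.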
as a scale example, CPU-based HPC machines with EMC disks would benefit from this kind of feature, since it would make provision for parallel builds for a given project. which brings me to my next point.
I am starting to parallelize my compiles BECAUSE IT MAKES THINGS FASTER up to a limit, which is probably some combination of disk speed and cpu threads.
this will afford more speed since most procs are multithreaded/multicore and windows treats the cores and threads like cpus. of course, this is not limited to windows. you can bring this to the mac and to linux also. any platform that has a C++ compiler that compiles lots of individual files.
this will make things disk bound very quickly however, since usually these files reside on a single disk. this is where RAID comes in very handy and provides the extra jump in speed. and this is where the EMC or even small-scale RAID boxes for personal use or 19" RAID racks for work use can come in. with this, you can make build servers work even better by using the processors more efficiently. this is also where cloud build services can do things like let you buy a certain number of threads for faster build time, etc.
an idea I have is you can have a fixed pool of worker threads assigned as compile job engines, each with their own spooler.
a SERIALCOMPILE command would take a job list, such as a list of .o/.obj files you want made from .c files, and hand the jobs out to the workers.
and you need to sync up the jobs internally when finishing a SERIALCOMPILE command, so that the command only finishes with them all done.
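a rough sketch of the pool-plus-sync idea, in python. this is an assumption about how it could look, not how any make does it: run_jobs is a made-up name, each job is just a command line given as an argv list, and the pool uses one shared queue instead of a spooler per worker to keep it short.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_jobs(commands, workers):
    """Run each command (an argv list) on a fixed pool of worker threads.

    Returns the exit codes in the same order as `commands`. The function
    only returns after every job has finished, which is the sync point.
    """
    def run_one(argv):
        # each worker thread blocks on its own child process
        return subprocess.run(argv).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands jobs to idle workers and yields results in input order
        return list(pool.map(run_one, commands))
```

the threads here spend their time waiting on compiler processes, so the real parallel work happens in those processes and not in python itself. a nonzero entry in the returned list means that compile failed.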
instead of the usual compile command using .c.o: $(CC) etc, I think it was, you do SERIALCOMPILE 12 THREADS for a 12-threaded CPU or SERIALCOMPILE AUTO THREADS and then your list of .cpp files and what file extension you want them turned into (such as .obj), and what command you want to use to do it. maybe this would have to take up several lines of make.
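to show what such a command would have to expand to internally, here is a small python sketch that turns a source list, a target extension, and a command into one compile command per file. build_jobs and the "{src}" / "{out}" placeholders are invented for this sketch and are not make syntax.

```python
from pathlib import Path

def build_jobs(sources, target_ext, command_template):
    """Turn a source list into one compile command per file.

    command_template is an argv list in which "{src}" and "{out}" are
    placeholders (a made-up convention for this sketch).
    """
    jobs = []
    for src in sources:
        # swap the source extension for the requested target extension
        out = str(Path(src).with_suffix(target_ext))
        jobs.append([arg.format(src=src, out=out) for arg in command_template])
    return jobs
```

for example, build_jobs(["main.cpp"], ".obj", ["g++", "-c", "{src}", "-o", "{out}"]) returns [["g++", "-c", "main.cpp", "-o", "main.obj"]]. the resulting list is the kind of job list a worker pool would then run.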
or something like
think of it like a glorified make.
at some point I would hope we see a parallel build system in place, once software developers start thinking in terms of parallel builds instead of serial builds. it would cut build time to a fraction, but you have to be careful HOW you do it.
some things just have to be done serially. that would just be regular make commands.
I am not sure if I am providing this idea to the right person or not. maybe I should be going to Intel or to GNU or to Microsoft or Apple or all of the above. but I didn't really want one vendor hogging all of the benefits. so I thought I would bring the idea to you. should you want to bring the specification of a parallel build system into the language, I would appreciate this (because I could certainly use it!).
and for us software developers, it would reduce our compile times.
note that on windows machines, there is WaitForMultipleObjects() in the Win32 API to handle the issue of waiting for [process, window, thread, whatever] HANDLEs without using while+for loops and some sort of conditional.
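WaitForMultipleObjects() takes a bWaitAll flag that chooses between "wake me when any one is signaled" and "wake me when all are signaled". the sketch below is a portable python analogue of those two modes using concurrent.futures.wait, not the Win32 call itself. fake_job and wait_demo are made-up names, and the sleeps stand in for compile jobs.

```python
import time
from concurrent.futures import (
    ThreadPoolExecutor, wait, ALL_COMPLETED, FIRST_COMPLETED,
)

def fake_job(seconds):
    # stand-in for a compile job: just sleep for a while
    time.sleep(seconds)
    return seconds

def wait_demo():
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(fake_job, s) for s in (0.05, 0.3, 0.3)]
        # like bWaitAll=FALSE: block until at least one job is done
        done_first, _ = wait(futures, return_when=FIRST_COMPLETED)
        # like bWaitAll=TRUE: block until every job is done
        done_all, pending = wait(futures, return_when=ALL_COMPLETED)
    return len(done_first), len(done_all), len(pending)
```

both wait() calls block without any polling loop in user code, which is the property I am after.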
personally, I would like to see this in every make and build system and see it made generally available. everyone has multicore processors now. let's make use of them!