Re: Parallelization of shell scripts for 'configure' etc.

From: Alex Ameen
Subject: Re: Parallelization of shell scripts for 'configure' etc.
Date: Mon, 13 Jun 2022 20:37:20 -0500

Yeah honestly splitting most of the `configure` checks into multiple
threads is definitely possible.
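
As a rough illustration of the idea, here is a minimal sketch that runs two independent checks concurrently as background jobs. This is not what Autoconf generates today. The `check_tool` helper and the tool names are hypothetical and exist only for this example. Each check writes its answer to its own file, because a background job runs in a subshell and cannot set variables in the parent script.

```shell
#!/bin/sh
# Hypothetical sketch: two independent "checks" run concurrently.
# check_tool is an illustrative helper, not an Autoconf macro.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

check_tool() {
    # Record whether a program is on PATH; the result goes to a per-check file.
    if command -v "$1" >/dev/null 2>&1; then echo yes; else echo no; fi >"$tmp/tool_$1"
}

check_tool sh &
check_tool definitely-not-a-real-tool &
wait   # block until both background checks have finished

# Collect the results only after every job is done.
have_sh=$(cat "$tmp/tool_sh")
have_fake=$(cat "$tmp/tool_definitely-not-a-real-tool")
echo "sh: $have_sh"
echo "definitely-not-a-real-tool: $have_fake"
```

This only works when the checks do not depend on each other's results, which is the hard part to establish for a real `configure` script.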

Caching between projects is even a straightforward extension with systems
like `Nix`.

The "gotcha" here in both cases is that existing scripts that already live
in source tarballs cannot feasibly be "regenerated" in the general case.
This could ship with future releases, though, if project authors updated to
new versions of Autoconf.

If you have a particularly slow package, you can optimize it in a few
hours. Largely this means "identify which tests 100% match the standard
implementation of a check" in which case you can fill in a cached value.
But what I think y'all are asking about is "can I safely use a cache from
one project in another project?" and the answer there is "no not really -
and please don't because it will be a nightmare to debug".
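
For the single-project case, a minimal sketch of filling in cached values might look like the following. It writes a site file and points `configure` at it through the `CONFIG_SITE` environment variable. The two cache variables follow Autoconf's usual `ac_cv_header_*` and `ac_cv_func_*` naming, but the assumption that these checks are unmodified in your project, and that "yes" is the right answer on your machine, is exactly what you would need to verify by hand first.

```shell
#!/bin/sh
# Sketch: pre-seed Autoconf cache variables through a site file.
site_dir=$(mktemp -d) || exit 1
trap 'rm -rf "$site_dir"' EXIT
site_file="$site_dir/config.site"

# Assumption: these two checks are the stock header/function checks in THIS
# project, and "yes" is the correct result on this machine.
cat >"$site_file" <<'EOF'
: "${ac_cv_header_stdlib_h=yes}"
: "${ac_cv_func_memcpy=yes}"
EOF

# A real run would look like this (commented out; it needs a configure script):
#   CONFIG_SITE="$site_file" ./configure

# Show what configure would see after loading the site file.
. "$site_file"
echo "ac_cv_header_stdlib_h=$ac_cv_header_stdlib_h"
echo "ac_cv_func_memcpy=$ac_cv_func_memcpy"
```

The site file is kept per project on purpose: reusing it for a different project is the cache sharing warned against above.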

The nasty part about trying to naively share caches is that it will
probably work fine ~90% of the time. The problem is that the 10% that
misbehave are at high risk of undefined behavior. My concern is the 0.5%
that appear to work fine, but "whoops we didn't know project X extended a
macro without changing the name - and now an ABI conflict in `gpgp` appears
on the third Sunday of every October, causing it to skip encryption
silently" or some absurd edge case.

I think optimizing "freshly generated" scripts is totally doable though.

On Mon, Jun 13, 2022, 5:40 PM Paul Eggert <eggert@cs.ucla.edu> wrote:

> In many GNU projects, the 'configure' script is the biggest barrier to
> building because it takes soooo long to run. Is there some way that we
> could improve its performance without completely reengineering it, by
> improving Bash so that it can parallelize 'configure' scripts?
>
> For ideas about this, please see PaSh-JIT:
>
> Kallas K, Mustafa T, Bielak J, Karnikis D, Dang THY, Greenberg M,
> Vasilakis N. Practically correct, just-in-time shell script
> parallelization. Proc OSDI 22. July 2022.
> https://nikos.vasilak.is/p/pash:osdi:2022.pdf
>
> I've wanted something like this for *years* (I assigned a simpler
> version to my undergraduates but of course it was too much to expect
> them to implement it) and I hope some sort of parallelization like this
> can get into production with Bash at some point (or some other shell if
> Bash can't use this idea).

