bug-hello

Re: new feature for "hello"


From: Terence Kelly
Subject: Re: new feature for "hello"
Date: Fri, 12 Aug 2022 02:30:55 -0400 (EDT)
User-agent: Alpine 2.22 (DEB 394 2020-01-19)



Hi Roland & Reuben,

Thanks for your quick and detailed replies.

Regarding "Reproducible Builds", I'm thinking along different lines. That project's motives emphasize security, and its goals include making builds deterministic. All else being equal, that's a fine goal, but it would seem to make it difficult to exploit advances in compiler optimizations, for example. And I wonder how much real improvement in security it buys us.

My suggestion is different: If (for example) the "grep" executable that's first in your $PATH is a literate executable, invoking it as "grep --dump-tgz" will spit out the tarball whence it came. The chicken (executable) can lay an egg (tarball) that can be built into another chicken. The two chickens need not be bit-for-bit identical, but the second chicken will be able to lay an egg that is bit-for-bit identical to the first egg. The phenotype may change from one generation to the next, but the genotype remains intact. (This aspect of conformity, of course, is easy to verify independently: Given a tarball, build an executable and then extract the tarball within the executable, which should be identical to the tarball you started with.)
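The egg check in that parenthetical can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: that the executable writes the tarball to standard output when given "--dump-tgz", and that you have already built the executable from the tarball yourself (the build step is left out). The function names are hypothetical, not part of any existing tool.

```python
import subprocess
from typing import Sequence


def dump_tarball(command: Sequence[str]) -> bytes:
    """Run a literate executable and return the tarball it emits.

    Assumption: the tarball is written to stdout, e.g. for
    command = ["./grep", "--dump-tgz"].
    """
    result = subprocess.run(list(command), capture_output=True, check=True)
    return result.stdout


def egg_is_intact(original_tarball: bytes, command: Sequence[str]) -> bool:
    """True if the emitted egg is bit-for-bit identical to the original egg.

    Only the tarballs are compared; the executables themselves (the
    chickens) are free to differ from one build to the next.
    """
    return dump_tarball(command) == original_tarball


# Hypothetical usage, after building ./grep from grep_1.2.3.tar.gz:
#
#     with open("grep_1.2.3.tar.gz", "rb") as f:
#         print(egg_is_intact(f.read(), ["./grep", "--dump-tgz"]))
```

Comparing the raw bytes keeps the check independent of compiler, flags, and platform, which is the point: two differently optimized builds should both pass.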

Again, my primary motive is not security. It is to prevent executables and the corresponding source code from wandering so far apart that having the former leaves you clueless about the latter. Stallman's Four Software Freedoms include at least two that require source access. In practice on today's GNU/Linux systems, even highly experienced users can't find the exact sources behind the utilities and libraries they use every day. This need not be the case; the fix is very easy.

So to answer Roland's specific question, "what should be included in the executable and what should be left out?": the grep executable should contain, and emit on demand, the grep.tar.gz from which it was (purportedly) built. The proposal has nothing to say about patches, build tools, the OS, or the toolchain. (It might be termed "eggcentric," the egg being the center of attention.)

The conceptual framework of literate executables begins with the axiom that a source tarball is the One True Definitive Embodiment of a piece of software. In the case of grep version 1.2.3, the One True Definitive Embodiment is grep_1.2.3.tar.gz. That's the thing you audit, the thing you read to find out how it works, the starting point for fixing bugs and adding functionality, the thing you need in order to re-build to get a faster executable that exploits recent advances in compiler technology, and the thing that the code's authors should sign to prove authenticity. It's also the ancestor of myriad executables built for myriad hardware platforms. Not all software conforms to the axiom that it exists as a tarball, but many GNU packages do. I've been working with gawk lately, for example, and there's an FSF site where every release of gawk can be downloaded as a .tar.gz file.

Thanks again for your reply, and feel free to ask further questions. I might implement literate executables for (a fork of) the "hello" package.

-- Terence




On Thu, 11 Aug 2022, Roland Illig wrote:

> On 11.08.2022 at 00:44, Terence Kelly wrote:
>> I call this feature "literate executables" in homage to Knuth's
>> "literate programming":  binaries such as executables and libraries
>> (.so) compiled from C/C++ can emit their own source code on demand.
>
> Where do you draw the line between "source code", "patches applied after
> extracting the archive", "tool used to build the program", "operating
> system below the toolchain"? So what should be included in the
> executable and what should be left out?
>
> How does your approach compare to https://reproducible-builds.org/,
> which records similar information but stores it outside the resulting
> binary?
>
> Roland


