From: Ludovic Courtès
Subject: 04/14: gpce-2017: Fixlets.
Date: Fri, 1 Sep 2017 11:57:54 -0400 (EDT)
civodul pushed a commit to branch master
in repository maintenance.
commit 5ffdfc4afd03ec251eec2e7bd1a31186d9c54a14
Author: Ludovic Courtès <address@hidden>
Date: Thu Jul 6 23:52:34 2017 +0200
gpce-2017: Fixlets.
---
doc/gpce-2017/code/gexp-expansion.scm | 4 +-
doc/gpce-2017/gpce.skb | 227 +++++++++++++++++++---------------
doc/gpce-2017/staging.sbib | 4 +-
3 files changed, 128 insertions(+), 107 deletions(-)
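
Note on the first hunk below: the renamed identifiers in gexp-expansion.scm
change from x0 and x1 to x-1bd8-0 and x-4f05-0. This matches the gpce.skb
text in this commit, which says identifiers are generated as a function of
the hash of the input expression and of the lexical nesting level. The
following is a minimal sketch of that idea in plain Guile, not the Guix
implementation. The procedure names expression-hash and renamed-identifier,
the 16-bit polynomial hash, and hashing the written form of the expression
are all assumptions made for illustration.

```scheme
;; Hypothetical sketch, not the code used by the `gexp' macro in Guix.
(use-modules (srfi srfi-13))            ;string-fold

(define (expression-hash exp)
  ;; Return a 16-bit hash of the written representation of EXP.  It depends
  ;; only on the text of EXP, so it is stable from one run to the next.
  (let ((str (call-with-output-string
               (lambda (port) (write exp port)))))
    (string-fold (lambda (chr acc)
                   (modulo (+ (* acc 31) (char->integer chr)) 65536))
                 0
                 str)))

(define (renamed-identifier name exp level)
  ;; NAME is the original symbol, EXP the expression that binds it, and
  ;; LEVEL its lexical nesting level.  Return a symbol of the form
  ;; NAME-HASH-LEVEL, with HASH written in hexadecimal.
  (string->symbol
   (string-append (symbol->string name) "-"
                  (number->string (expression-hash exp) 16) "-"
                  (number->string level))))

;; Example call: the same arguments always yield the same symbol.
(renamed-identifier 'x '(let ((x 2)) x) 0)
```

Because the generated name depends only on the expression and the nesting
level, two runs produce the same residual code and hence the same
derivation, which gensym would not guarantee.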
diff --git a/doc/gpce-2017/code/gexp-expansion.scm b/doc/gpce-2017/code/gexp-expansion.scm
index 111fa81..ecd5637 100644
--- a/doc/gpce-2017/code/gexp-expansion.scm
+++ b/doc/gpce-2017/code/gexp-expansion.scm
@@ -34,7 +34,7 @@
#~(let ((x 2))
#$(gen-body #~x))
-⇒ (let ((x0 2))
- (let ((x1 40)) (+ x1 x0)))
+⇝ (let ((x-1bd8-0 2))
+ (let ((x-4f05-0 40)) (+ x-4f05-0 x-1bd8-0)))
;;!end-gexp-hygiene
diff --git a/doc/gpce-2017/gpce.skb b/doc/gpce-2017/gpce.skb
index 8101a93..29afd69 100644
--- a/doc/gpce-2017/gpce.skb
+++ b/doc/gpce-2017/gpce.skb
@@ -157,21 +157,21 @@
(p [GNU Guix is a “functional” package manager that builds upon
earlier work on Nix. Guix implements high-level abstractions such as
packages and operating system services as domain-specific languages
-(DSL) embedded in Scheme, and it also implements build actions and
+(DSLs) embedded in Scheme. It also implements build actions and
operating system orchestration in Scheme. This leads to a multi-tier
programming environment where embedded code snippets are staged for
eventual execution.])
- (p [In this paper we present ,(emph [G-expressions]) or “,(emph
+ (p [This paper presents ,(emph [G-expressions]) or “,(emph
[gexps])”, the staging mechanism we devised for Guix. We explain our
journey from traditional Lisp S-expressions to G-expressions, which
augment the former with contextual information and ensure hygienic code
staging.
We discuss the
implementation of gexps and report on our experience using them in a
variety of operating system use cases—from package build processes
-to system services. To our knowledge, gexps
-provide a unique way to cover many aspects of OS configuration in a
-single, multi-tier language, and to facilitate code reuse and code
+to system services. Gexps
+provide a novel way to cover many aspects of OS configuration in a
+single, multi-tier language, while facilitating code reuse and code
sharing.]))
;; See <http://dl.acm.org/ccs/ccs_flat.cfm>.
@@ -194,7 +194,7 @@ that software build processes are considered as pure functions: given
a set of inputs (compiler, libraries, build scripts, and so on), a
package’s build function is assumed to always produce the same result.
Build results are stored in an immutable persistent data structure,
-the store, implemented as a single directory, ,(tt [/gnu/store]).
+the ,(emph [store]), implemented as a single directory, ,(tt [/gnu/store]).
Each entry in ,(tt [/gnu/store]) has a file name composed of the hash
of all the build inputs used to produce it, followed by a symbolic
name. For example, ,(tt [/gnu/store/yr9rk90jf…-gcc-7.1.0]) identifies
@@ -202,21 +202,21 @@ a specific build of GCC 7.1. A variant of GCC 7.1, for instance one
using different build options or different dependencies, would get a
different hash. Thus, each store file name uniquely identifies build
results, and build processes are ,(emph [referentially transparent]).
-This simplifies the reasoning on complex package compositions, but it
+This simplifies reasoning on complex package compositions, and
also has nice properties such as supporting transactional upgrades and
rollback “for free.” While Guix and Nix are package managers, the
-Guix System Distribution (or GuixSD) as well as NixOS extends the
+Guix System Distribution (or GuixSD) as well as NixOS extend the
functional paradigm to whole operating system deployments ,(ref :bib
'dolstra2010:nixos).])
(p [While Guix implements this functional deployment paradigm
pioneered by Nix, we explained in previous work that its
implementation departs from Nix in interesting ways ,(ref :bib
'courtes2013:functional). First, while Nix relies on a custom
-domain-specific language (DSL), the Nix language, Guix instead chooses
-to devise a set of DSLs and data structures embedded in the
-general-purpose language Scheme. The rationale was that this approach
-would ease the development of user interfaces and tools dealing with
-packages, and would allow users to benefit from everything a
+domain-specific language (DSL), the Nix language, Guix instead
+implements a set of DSLs and data structures embedded in the
+general-purpose language Scheme. This simplifies
+the development of user interfaces and tools dealing with
+packages, and allows users to benefit from everything a
general-purpose language brings: compiler, debugger, REPL, editor
support, libraries, and so on. Four years later, Guix has indeed
gained rich tooling that would have been harder to develop for an
@@ -264,28 +264,31 @@ perform the build (the ,(emph [build program])), environment variables to be
defined, and derivations whose build result it depends on.
Derivations are sent to a privileged daemon, which is responsible for
building them on behalf of clients. The build daemon creates isolated
-environments (isolated ,(emph [containers]) in a chroot) in which it spawns
-the build program; since build environments are isolated, this ensures
+environments (,(emph [containers]) in a chroot) in which it spawns
+the build program; isolated build environments ensure
that build programs do not depend on undeclared inputs.])
(p [The second way in which Guix departs from Nix is by using
the same language, Scheme, for all its functionality. While package
definitions in Nix can embed Bash or Perl snippets to refine build
-steps, Guix package definitions would instead embed Scheme snippets.
-Consequently, we have two strata of Scheme code: the “host code”,
-which provides the package definition, and the “build code”, which is
+steps, Guix package definitions instead embed Scheme code.
+Consequently, we have two strata of Scheme code: the ,(emph [host code]),
+which provides the package definition, and the ,(emph [build code]), which is
staged for later execution by the build daemon. Our thesis is that
this single-language, “multi-tier” approach facilitates code reuse and
code sharing among the several tiers, and that it can avoid a whole
class of errors in the staged code—as opposed to generation of code in
a “foreign” language, which is treated a mere strings where syntactic
and semantic errors cannot be detected by the host code.])
- (p [This paper focus on code staging in Guix. Our contribution
+ (p [This paper focuses on code staging in Guix. Our contribution
is twofold: we present G-expressions (or “gexps”), a new code staging
mechanism implemented through mere syntactic extensions of the Scheme
-language, and its use in several areas of the “orchestration” programs
-of the operating system. ,(numref :text [Section] :ident "design")
-describes the evolution of code staging in Guix from its inception, as
-described in ,(ref :bib 'courtes2013:functional), to gexps. ,(numref
+language; we show the use of gexps in several areas of the “orchestration” programs
+of the operating system. ,(numref :text [Section] :ident "origins")
+discusses the early attempt at code staging in Guix, as
+mentioned in ,(ref :bib 'courtes2013:functional), and its shortcomings.
+,(numref :text [Section] :ident "gexps") presents the design and
+implementation of gexps.
+,(numref
:text [Section] :ident "experience") reports on our experience using
gexps in a variety of areas in Guix and GuixSD. ,(numref :text
[Section] :ident "limitations") discusses limitations and future work.
@@ -293,15 +296,16 @@ Finally ,(numref :text [Section] :ident "related") compares gexps to
related work and ,(numref :text [Section] :ident "conclusion")
concludes.]))
- (chapter :title [Design and Implementation]
- :ident "design"
+ (chapter :title [Early Attempt]
+ :ident "origins"
(p [Scheme is a dialect of Lisp, and Lisp is famous for its
homoiconicity—the fact that code has a direct representation as a data
structure using the same syntax. “S-expressions” or “sexps”, Lisp’s
parenthecal expressions, thus look like they lend themselves to code
-staging. In this section we show how we started with sexps to end up
-with gexps as an “augmented” version of sexps.])
+staging.
+In this section we show how we this early experience made it clear that
+we needed an ,(emph [augmented]) version of sexps.])
(section :title [Staging Build Expressions]
@@ -316,18 +320,20 @@ with gexps as an “augmented” version of sexps.])
(p [In previous work ,(ref :bib 'courtes2013:functional), we
presented our first attempt at writing build expressions in Scheme,
-which relied solely on Lisp’s famous quotation mechanism ,(ref :bib
+which relied solely on Lisp quotation ,(ref :bib
'bawden1999:quasiquotation). Figure ,(ref :figure "fig-build-sexp")
shows an example that creates a derivation that, when built, converts
the input image to JPEG, using the ,(tt [convert]) program from the
-ImageMagick package. In this example, variable ,(tt [store])
+ImageMagick package—this is equivalent to a three-line makefile, but
+referentially transparent. In this example, variable ,(tt [store])
represents the connection to the build daemon. The ,(tt
[package-derivation]) function takes the ,(tt [imagemagick]) package
object and computes its corresponding derivation, while the ,(tt
[add-to-store]) remote procedure call (RPC) instructs the daemon to
add the file ,(tt [GuixSD.png]) to ,(tt [/gnu/store]). The variable
-,(tt [build]) contains our build program as an sexp, thanks to the
-apostrophe, which means “quote” in Lisp. Finally, ,(tt
+,(tt [build]) contains our build program as an sexp (the
+apostrophe is equivalent to ,(tt [quote]); it introduces unevaluated
+code). Finally, ,(tt
[build-expression->derivation]) takes the build program and computes
the corresponding derivation without building it. The user can then
make an RPC to the build daemon asking it to build this derivation;
@@ -351,20 +357,20 @@ pleasant ,(tt [package]) interface shown in Figure ,(ref :figure
derivation and its dependencies, but where does the verbosity come
from? First, we have to explicitly call ,(tt [package-derivation])
for each package the expression refers to. Second, we have to
-explicitly the inputs with labels at the call site. Third, the build
+specify the inputs with labels at the call site. Third, the build
code has to use this ,(tt [assoc-ref]) call just to retrieve the ,(tt
-[/gnu/store]) file name of its inputs. It is also error-prone: if we
+[/gnu/store]) file name of its inputs. It is error-prone: if we
omit the ,(tt [#:inputs]) parameter, of if we mispell an input label,
we will only find out when we build the derivation.])
(p [Another limitation not visible on a toy example but that
-became clear as we developed GuixSD it the cost of carrying this ,(tt
+became clear as we developed GuixSD is the cost of carrying this ,(tt
[#:inputs]) argument down to the call site. It forces programmers to
carry not only the build expression, ,(tt [build]), but also the
-corresponding ,(tt [inputs]) argument down to the call site. This
-essentially makes it very hard to compose build expressions.])
- (p [While ,(tt [quote]) allowed to easily represent code as
-expected, it clearly lacks some of the machinery that would make
-staging in Guix more convenient. It boilds down to two things: it
+corresponding ,(tt [inputs]) argument, and
+makes it very hard to compose build expressions.])
+ (p [While ,(tt [quote]) allowed us to easily represent code, it
+clearly lacked some of the machinery that would make
+staging in Guix more convenient. It boils down to two things: it
lacks ,(emph [context])—the set of inputs associated with the
expression—and it lacks the ability to serialize high-level objects—to
replace a reference to a package object with its ,(tt [/gnu/store])
@@ -373,7 +379,8 @@ file name.])))
(chapter :title [G-Expressions]
:ident "gexps"
- (p [This section describes the design and implementation of
+ (p [We devised “G-expressions” as a mechanism to address
+these shortcomings. This section describes the design and implementation of
G-expressions, as well as extensions we added to address new use
cases.])
@@ -388,8 +395,7 @@ cases.])
:start ";!begin-imagemagick-gexp"
:stop ";!end-imagemagick-gexp")))
- (p [We devised “G-expressions” as a mechanism to address
-these shortcomings. In essence, a gexp bundles an sexp and its inputs
+ (p [In essence, a gexp bundles an sexp and its inputs
and outputs, and it can be serialized with ,(tt [/gnu/store]) file
names substituted as needed. We first define two operators:
@@ -398,8 +404,8 @@ names substituted as needed. We first define two operators:
Scheme’s ,(tt [quasiquote]): it allows users to describe unevaluated
code.])
(item [,(tt [ungexp]), abbreviated ,(tt [#$]), is the counterpart
-of Scheme’s ,(tt [unquote]): it allows quoted to refer to values in
-the host language. These values can be of any of Scheme’s primitive
+of Scheme’s ,(tt [unquote]): it allows quoted code to refer to values in
+the host program. These values can be of any of Scheme’s primitive
data types, but we are specifically interested in values such as
package objects that can be “compiled” to elements in the store.])
(item [,(tt [ungexp-splicing]), abbreviated ,(tt [#$@]), allows a
@@ -408,7 +414,7 @@ Scheme’s ,(tt [unquote-splicing]).]))
The example in Figure ,(ref :figure "fig-build-sexp"), rewritten as a
gexp, is shown in Figure ,(ref :figure "fig-build-gexp"). We have all
-the properties we were looking for: the gexp contains carries
+the properties we were looking for: the gexp carries
information about its inputs that does not need to be passed at the
,(tt [gexp->derivation]) call site, and the reference to ,(tt
[imagemagick]), which is bound to a package object, is automatically
@@ -419,11 +425,11 @@ is because we implemented ,(tt [gexp->derivation]) as a monadic
function in the ,(emph [state monad]), where the state threaded
through monadic function calls is that store parameter. The use of a
monadic interface is completely orthogonal to the gexp design though,
-so we will not insist on it.]) ,(tt [local-file]) returns a new
-Scheme record that denotes a file from the local file system to be
+so we will not insist on it.]). ,(tt [local-file]) returns a new
+record that denotes a file from the local file system to be
added to the store.])
(p [Under the hood, ,(tt [gexp->derivation]) converts the
-gexp to an sexp, the final build program, stored under ,(tt
+gexp to an sexp, the residual build program, and stores it under ,(tt
[/gnu/store]). In doing that, it replaces the ,(tt [ungexp]) forms
,(tt [#$imagemagick]) and ,(tt [#$image]) with their corresponding
,(tt [/gnu/store]) file names. The special ,(tt [#$output]) form,
@@ -447,7 +453,8 @@ is created.])
lexical scope across stages]) ,(ref :bib '(rhiger2012:hygienic
kiselyov2008:metascheme kohlbecker1986:hygienic)).]
- (figure :legend [Lexical scope preservation across stages.]
+ (figure :legend [Lexical scope preservation across stages (⇝
+denotes code generation).]
:ident "fig-gexp-hygiene"
(prog :line #f
@@ -460,7 +467,8 @@ well-known properties of hygienic multi-stage programs: first, binding
,(tt [x]) in one stage (outside the gexp) is distinguished from
binding ,(tt [x]) in another stage (inside the gexp); second, binding
,(tt [x]) introduced inside ,(tt [gen-body]) does not shadow binding
-,(tt [x]) in the outer gexp thanks to the renaming of these variables.]))
+,(tt [x]) in the outer gexp thanks to the renaming of these variables
+in the residual program.]))
(section :title [Implementation]
:ident "implementation"
@@ -475,16 +483,16 @@ generation (⇝).]
:start ";;!begin-gexp-expansion"
:stop ";;!end-gexp-expansion")))
- (p [As can be seen from the example above, gexps are
+ (p [As can be seen from the examples above, gexps are
first-class Scheme values: a variable can be bound to a gexp, and
gexps can be passed around like any other value. The implementation
consists of two parts: a syntactic layer that turns ,(tt [#~]) forms
into code that instantiates gexp records, and run-time support
-procedures to serialize gexps and to “lower” their inputs.])
- (p [Scheme is extensible through macros, so ,(tt [gexp]) is a
-“hygienic” ,(tt [syntax-case]) macro ,(ref :bib
+functions to serialize gexps and to ,(emph [lower]) their inputs.])
+ (p [Scheme is extensible through macros, and ,(tt [gexp]) is a
+,(tt [syntax-case]) macro ,(ref :bib
'dybvig1992:syntax-case); ,(tt [#~]) and ,(tt [#$]) are ,(it [reader
-macros]) that expand to a ,(tt [gexp]) or ,(tt [ungexp]) sexps. This
+macros]) that expand to ,(tt [gexp]) or ,(tt [ungexp]) sexps. This
is implemented as a library for GNU,(~)Guile, an R5RS/R6RS Scheme
implementation, ,(emph [without any modification to its compiler]).
Figure ,(ref :figure "fig-gexp-expansion") shows what our ,(tt [gexp])
@@ -492,20 +500,22 @@ macro expands to. In the expanded code, ,(tt [gexp-input]) returns a
record representing a dependency, while ,(tt [make-gexp]) returns a
record representing the whole gexp. The expanded code defines a
function of two arguments, ,(tt [proc]), that returns an sexp; the
-sexp is simply the body of the gexp with these two arguments inserted
+sexp is the body of the gexp with these two arguments inserted
at the point where the original ,(tt [ungexp]) forms appeared.
-Intenally, ,(tt [gexp->sexp]), the function that converts gexps to
+Internally, ,(tt [gexp->sexp]), the function that converts gexps to
sexps, calls this two-argument procedure passing it the store file
names of ImageMagick and Emacs. This strategy gives us constant-time
substitutions.])
- (p [The internal ,(tt [gexp-input]) function returns, for a
+ (p [The internal ,(tt [gexp-inputs]) function returns, for a
given gexp, store, and system type, the derivations that the gexp
depends on. In this example, it returns the derivations for
ImageMagick and Emacs, as computed by the ,(tt [package-derivation])
function seen earlier. Gexps can be nested, as in ,(tt
[#~#$#~(string-append #$emacs "/bin/emacs")]). The input list
returned by ,(tt [gexp-inputs]) for the outermost gexp is the sum of
-the inputs of outermost gexp and the inputs nested gexps.])
+the inputs of the outermost gexp and the inputs nested gexps. Likewise,
+,(tt [gexp-outputs]) returns the outputs declared in a gexp and in
+nested gexps.])
(p [The ,(tt [gexp]) macro performs several passes on its body:
,(enumerate
@@ -520,12 +530,15 @@ the literature, identifiers must be generated in a ,(emph
[deterministic]) fashion: if they were not, we would produce different
derivations at each run, which in turn would trigger full rebuilds of
the package graph. Thus, instead of relying on ,(tt [gensym]) and
-,(tt [generate-temporaries]), we generate identifiers using a hash for
-the input expression as a stem, along with lexical nesting level of
-the identifer.])
+,(tt [generate-temporaries]), we generate identifiers as a function of
+the hash of
+the input expression and of the lexical nesting level of
+the identifier—these are the two components we can see in the generated
+identifiers of Figure ,(ref
+:figure "fig-gexp-hygiene").])
(item [The second pass ,(emph [collects the escape forms]) (,(tt
[ungexp]) variants) in the input source. The list of escape forms is
-needed to construct the list of inputs recorded in the ,(tt [<gexp>])
+needed to construct the list of inputs stored in the gexp
record, and to construct the formal argument list of the gexp’s code
generation function shown in Figure ,(ref :figure
"fig-gexp-expansion").])
@@ -558,8 +571,8 @@ by ,(tt [gexp->sexp]) when it encounters instances of the relevant
type in a gexp that is being processed.])
(p [Gexp compilers can also have an associated ,(emph
[expander]), which specifies how objects should be “rendered” in the
-final sexp. The default expander simply produces the store file name
-that corresponds to the output of the derivation. For example,
+residual sexp. The default expander simply produces the store file name
+of the derivation output. For example,
assuming the variable ,(tt [emacs]) is bound to a package object, ,(tt
[#~(string-append #$emacs "/bin/emacs")]) expands to ,(tt
[(string-append "/gnu/store/…-emacs-25.2" "/bin/emacs")]), as we have
@@ -576,19 +589,19 @@ when generating the sexp. We can now write gexps like:]
(!latex "\\\\[0.3cm]\n")
[This is convenient in situations where we do not want or cannot impose
-a build-side ,(tt [string-append]) code.]))
+a ,(tt [string-append]) call in staged code.]))
(section :title [Extensions]
(figure
- :legend [Specifying importing modules in a gexp.]
+ :legend [Specifying imported modules in a gexp.]
:ident "fig-gexp-modules"
(prog :line #f
(source :language guix :file "code/gexp-modules.scm")))
(p [,(bold [Modules.]) One of the reasons for using the same
-language uniformly is the ability to reuse Guile modules among in
+language uniformly is the ability to reuse Guile modules in
several contexts. Since builds are performed in an isolated
environment, Scheme modules that are needed must be explicitly ,(emph
[imported]) in that environment; in other words, the modules must be
@@ -598,7 +611,7 @@ objects embed information about the modules they need; the ,(tt
modules to import in the gexps that appear in its body. The example
in Figure ,(ref :figure "fig-gexp-modules") creates a gexp that
requires the ,(tt [(guix build utils)]) module and the modules it
-depends on in its execution environment. The source of these module
+depends on in its execution environment. The source of these modules
is taken from the user’s search path and added to the store when ,(tt
[gexp->derivation]) is called.])
(p [Note that, to actually bring the module in scope, we
@@ -629,36 +642,40 @@ background image to a suitable format, which resembles that of Figure
expression that converts the image should use the ,(emph [native])
ImageMagick, not the target ImageMagick, which it would not be able to
run anyway. Thus, we write ,(tt [#+imagemagick]) rather than ,(tt
-[#$imagemagick]).])))
+[#$imagemagick]). “Nativeness” propagates to all the values beneath
+,(tt [#+]).])))
(chapter :title [Experience]
:ident "experience"
- (p [Guix is used in production by individuals and organizations.
-This section reports on our experience using gexps in Guix.])
+ (p [Guix and GuixSD are used in production by individuals and
+organizations to deploy software on laptops, servers, and clusters.
+Introducing a new core mechanism in such a project can be both fruitful
+and challenging. This section reports on our experience using gexps in
+Guix.])
(section :title [Package Build Procedures]
(p [As explained earlier, gexps appeared quite recently in
the history of Guix. Package definitions like that of Figure ,(ref
-:figure "fig-package-def") relied on the previous ad-hoc staging
-mechanism. This can be seen in the use of labels in the ,(tt
+:figure "fig-package-def") rely on the previous ad-hoc staging
+mechanism, as can be seen in the use of labels in the ,(tt
[inputs]) field of definitions. Guix today includes more than 5,500
packages, which still use this old, pre-gexp style. We are
-considering a migration to the new style but given the size of the
+considering a migration to a new style but given the size of the
repository, this is a challenging task and we must make sure every use
case is correctly addressed in the new model.])
- (p [In theory, labels are no longer needed with the use of
-gexps since one can now use a ,(tt [#$]) escape when they need to
-refer to the absolute file name of an inputs. The indirection that
+ (p [In theory, labels are no longer needed with
+gexps since one can now use a ,(tt [#$]) escape to
+refer to the absolute file name of an input in ,(tt [arguments]). The indirection that
labels introduced had one benefit though: one could create a package
-variant with a different input, and ,(tt [(assoc-ref %build-inputs …])
+variant with a different ,(tt [inputs]) field, and ,(tt [(assoc-ref %build-inputs …)])
calls in build-side code would automatically resolve to the new
-package. If we instead allow for direct use of ,(tt [#$]) in package
+dependencies. If we instead allow for direct use of ,(tt [#$]) in package
,(tt [arguments]), those will be unaffected by changes in ,(tt
-[inputs]), which would break this particular use case. It remains to
+[inputs]). It remains to
be seen how we can allow ,(tt [#$]) forms while not sacrificing this
-flexibility.)]))
+flexibility.]))
(section :title [System Services]
@@ -682,7 +699,7 @@ the kernel Linux.])
:stop ";;!end-initrd")))
(p [The initrd is a small file system image that the kernel
-Linux mounts as its initial file system. It then runs the ,(tt
+Linux mounts as its initial root file system. It then runs the ,(tt
[/init]) program therein; this program is responsible for mounting the
real root file system and for loading any drivers needed to achieve
that. If the file system is encrypted, this is also the place where a
@@ -691,7 +708,7 @@ is a Scheme program that we generate based on the OS configuration,
using gexps. Figure ,(ref :figure "fig-initrd") illustrates the
creation of an initrd. Here ,(tt [expression->initrd]) returns a
derivation that builds an initrd containing the given gexp as the ,(tt
-[/init]) program. The staged program in this examples calls the ,(tt
+[/init]) program. The staged program in this example calls the ,(tt
[boot-system]) function from the ,(tt [(gnu build linux-boot)])
module. The initrd is automatically populated with Guile and its
dependencies, the closure of the ,(tt [(gnu build linux-boot)])
@@ -733,9 +750,9 @@ case is the operating system’s run-time environment.]))
(p [GuixSD comes with a set of ,(emph [whole-system tests]).
Each of them takes an ,(tt [operating-system]) definition, which defines
the OS configuration, instantiates it in a virtual machine (VM), and
-verifies that system running in a VM matches some of the settings. The
+verifies that the system running in the VM matches some of the settings. The
guest OS is instrumented with a Scheme interpreter that evaluates
-expressions sent by the host OS (we call it “marionette”).])
+expressions sent by the host OS—we call it “marionette”.])
(p [Whole-system tests are derivations whose build programs are
gexps that resemble that of Figure ,(ref :figure "fig-system-test").
The build program passes ,(tt [run]), the script to spawn the VM, to the
@@ -755,11 +772,11 @@ in ,(numref :text [Section] :ident "implementation"), follows the
well-documented approach to the problem ,(ref :bib
'(rhiger2012:hygienic kiselyov2008:metascheme)). Rhiger’s
implementation handles a single binding construct (,(tt [lambda])) and
-MetaScheme handles a couple more constructs, but of course, ours had
-to deal with many more binding constructs: R6RS defines around ten
+MetaScheme handles a couple more constructs, but ours has
+to deal with more binding constructs: R6RS defines around ten
binding constructs (including binding constructs for syntactic
keywords such as ,(tt [let-syntax])), and Guile adds a couple more.])
- (p [Fundamentally, this is all about identifying binding
+ (p [Hygiene in multi-stage programs relies on identifying binding
constructs. This turns out to be hard to achieve in Scheme because
macros can define ,(emph [new]) bindings constructs.
Our ,(symbol "alpha")-renaming pass is oblivious to those so it will
@@ -775,7 +792,8 @@ Guile variant used to evaluate “host-side” code. How we could hook
into Guile’s macro expander, based on ,(tt [psyntax]) ,(ref :bib
'dybvig1992:syntax-case), is still an open question. To our
knowledge, this problem of hygienic staging of a language with macros
-has not been addressed in literature.])
+has not been addressed in literature outside of work on macro expanders
+,(ref :bib 'dybvig1992:syntax-case).])
(p [On top of that, ,(tt [gexp]) must track the ,(emph
[quotation level]) of several types of quotation: ,(tt [gexp]), ,(tt
[quote]), ,(tt [quasiquote]), and ,(tt [syntax]) (though our
@@ -794,9 +812,12 @@ specify ,(emph [which modules should be in scope]), which could be
useful in some situations. Part of the reason is that in Guile ,(tt
[use-modules]) clauses must appear at the top level, and thus they
cannot be used in a gexp that ends up being inserted in a
-non-top-level position. Scoped ,(tt [use-modules]) clauses would help
-to some extent, but there are still open questions open question
-regarding potential name clashes.])
+non-top-level position. Macro expanders know the modules in scope at
+macro-definition points so they can replace free variables in residual
+code with fully-qualified references to variables inside the modules
+in scope at the macro definition point. How to achieve something
+similar with gexp, which lack the big picture that a macro expander has,
+remains an open question.])
(p [,(bold [Cross-stage debugging.]) ,(tt [gexp->derivation])
emits build programs as sexps in a file in ,(tt [/gnu/store]), using
Scheme ,(tt [write]), which writes the whole sexp as one line. When
@@ -810,8 +831,8 @@ feature was available in Scheme, it would be unsuitable: moving the
source code where a gexp appears would lead to a different derivation,
in turn triggering a rebuild of everything that depends on it, which
is undesirable. Instead we would need a way to pass source code
-mapping information ,(emph [off-band]), in a way that does not affect
-the derivation that is produced. We are still investigating ways to
+mapping information ,(emph [out-of-band]), in a way that does not affect
+the derivation that is produced. We are investigating ways to
achieve that.]))
(chapter :title [Related Work]
@@ -830,19 +851,18 @@ derivation, the Nix interpreter records this dependency in the string
context and substitutes the reference with the output file name of the
derivation.])
(p [Because Nix views this generated code as mere strings, it
-does provide any guarantee on the generated code (notably syntactic
+does not provide any guarantee on the generated code (notably syntactic
correctness). The string interpolation syntax (,(tt [${])…,(tt [}])
sequences), often clashes with the target’s language syntax (e.g.,
Bash uses dollar-brace syntax to reference variables), which can lead
-to subtle errors and contrain developers to resort to non-trivial
+to subtle errors and constrain developers to resort to non-trivial
escaping syntax. The “code-as-string” paradigm also has other side
-effects: comments and whitespace in those strings is preserved, which
-means those can trigger a rebuild of the derivation, which is
+effects: comments and whitespace in those strings is preserved, and
+changing those triggers a rebuild of the derivation, which is
inconvenient.])
(p [Code staging in Scheme has been studied in the context of
-,(emph [macros]). Dybvig’s work ,(ref :bib 'dybvig1992:syntax-case)
-introduced “hygienic” macros in Scheme—i.e., macros that generate
-well-scoped code, without unintended capture of variables—which later
+,(emph [hygienic macros])—i.e., macros that generate
+well-scoped code, without unintended capture of variables ,(ref :bib '(kohlbecker1986:hygienic dybvig1992:syntax-case))—which later
made it into the Sixth Report on Scheme (R6RS). MacroML achieves
something similar in the context of ML, which is statically-typed
,(ref :bib 'ganz2001:macroml). Both tools allow users to define new
@@ -891,7 +911,8 @@ G-expressions, support for tilde forms is built in the Hop compiler,
and tilde forms are not first-class objects. Hop comes with useful
multi-stage debugging facilities not found in Guix, such as the
ability to display cross-stage stack traces with correct source
-location information.])
+location information. It also has a way to express modules in scope for
+staged code.])
;; See refs at https://www.researchgate.net/publication/2632322_Writing_Hygienic_Macros_in_Scheme_with_Syntax-Case
diff --git a/doc/gpce-2017/staging.sbib b/doc/gpce-2017/staging.sbib
index c12b605..d4e0fd0 100644
--- a/doc/gpce-2017/staging.sbib
+++ b/doc/gpce-2017/staging.sbib
@@ -8,7 +8,7 @@
(url "http://www.cs.indiana.edu/~dyb/pubs/tr356.pdf"))
(inproceedings kohlbecker1986:hygienic
- (author "Kohlbecker, Eugene and Friedman, Daniel P. and Felleisen, Matthias and Duba, Bruce")
+ (author "Eugene Kohlbecker, Daniel P. Friedman, Matthias Felleisen, and Bruce Duba")
(title "Hygienic Macro Expansion")
(booktitle "Proceedings of the 1986 ACM Conference on LISP and Functional Programming")
(series "LFP '86")
@@ -95,7 +95,7 @@ Evaluation and Semantics-Based Program Manipulation (PEPM 1999)")
(url "http://repository.readscheme.org/ftp/papers/pepm99/bawden.pdf"))
(inproceedings rhiger2012:hygienic
- (author "Rhiger, Morten")
+ (author "Morten Rhiger")
(title "Hygienic Quasiquotation in Scheme")
(booktitle "Proceedings of the 2012 Annual Workshop on Scheme and Functional Programming")
(series "Scheme '12")