[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
04/06: publish: Add '--cache' and '--workers'.
From: |
Ludovic Courtès |
Subject: |
04/06: publish: Add '--cache' and '--workers'. |
Date: |
Tue, 18 Apr 2017 17:20:08 -0400 (EDT) |
civodul pushed a commit to branch master
in repository guix.
commit 00753f7038234a0f5a79be3ec9ab949840a18743
Author: Ludovic Courtès <address@hidden>
Date: Mon Apr 17 23:13:40 2017 +0200
publish: Add '--cache' and '--workers'.
Fixes <http://bugs.gnu.org/26201>.
Reported by <address@hidden>.
These options allow nars to be "baked" off-line and cached instead of
being compressed on the fly. As a side-effect, this allows us to
provide a 'Content-Length' header for nars.
* guix/scripts/publish.scm (show-help, %options): Add '--cache' and
'--workers'.
(%default-options): Add 'workers'.
(nar-cache-file, narinfo-cache-file, run-single-baker): New procedures.
(single-baker): New macro.
(render-narinfo/cached, bake-narinfo+nar)
(render-nar/cached): New procedures.
(make-request-handler): Add #:cache and #:pool parameters and honor
them.
(run-publish-server): Likewise.
(guix-publish): Honor '--cache' and '--workers'.
* tests/publish.scm ("with cache"): New test.
* doc/guix.texi (Invoking guix publish): Document it.
---
doc/guix.texi | 46 ++++++++++-
guix/scripts/publish.scm | 197 +++++++++++++++++++++++++++++++++++++++++++----
tests/publish.scm | 54 +++++++++++++
3 files changed, 280 insertions(+), 17 deletions(-)
diff --git a/doc/guix.texi b/doc/guix.texi
index fd3483e..bbb2ba7 100644
--- a/doc/guix.texi
+++ b/doc/guix.texi
@@ -6522,6 +6522,13 @@ archive}), the daemon may download substitutes from it:
guix-daemon --substitute-urls=http://example.org:8080
@end example
+By default, @command{guix publish} compresses archives on the fly as it
+serves them. This ``on-the-fly'' mode is convenient in that it requires
+no setup and is immediately available. However, when serving lots of
+clients, we recommend using the @option{--cache} option, which enables
+caching of the archives before they are sent to clients---see below for
+details.
+
As a bonus, @command{guix publish} also serves as a content-addressed
mirror for source files referenced in @code{origin} records
(@pxref{origin Reference}). For instance, assuming @command{guix
@@ -6559,10 +6566,43 @@ disable compression. The range 1 to 9 corresponds to
different gzip
compression levels: 1 is the fastest, and 9 is the best (CPU-intensive).
The default is 3.
-Compression occurs on the fly and the compressed streams are not
+Unless @option{--cache} is used, compression occurs on the fly and
+the compressed streams are not
cached. Thus, to reduce load on the machine that runs @command{guix
-publish}, it may be a good idea to choose a low compression level, or to
-run @command{guix publish} behind a caching proxy.
+publish}, it may be a good idea to choose a low compression level, to
+run @command{guix publish} behind a caching proxy, or to use
address@hidden Using @option{--cache} has the advantage that it
+allows @command{guix publish} to add @code{Content-Length} HTTP header
+to its responses.
+
address@hidden address@hidden
address@hidden -c @var{directory}
+Cache archives and meta-data (@code{.narinfo} URLs) to @var{directory}
+and only serve archives that are in cache.
+
+When this option is omitted, archives and meta-data are created
+on-the-fly. This can reduce the available bandwidth, especially when
+compression is enabled, since this may become CPU-bound. Another
+drawback of the default mode is that the length of archives is not known
+in advance, so @command{guix publish} does not add a
address@hidden HTTP header to its responses, which in turn
+prevents clients from knowing the amount of data being downloaded.
+
+Conversely, when @option{--cache} is used, the first request for a store
+item (@i{via} a @code{.narinfo} URL) returns 404 and triggers a
+background process to @dfn{bake} the archive---computing its
address@hidden and compressing the archive, if needed. Once the
+archive is cached in @var{directory}, subsequent requests succeed and
+are served directly from the cache, which guarantees that clients get
+the best possible bandwidth.
+
+The ``baking'' process is performed by worker threads. By default, one
+thread per CPU core is created, but this can be customized. See
address@hidden below.
+
address@hidden address@hidden
+When @option{--cache} is used, request the allocation of @var{N} worker
+threads to ``bake'' archives.
@item address@hidden
Produce @code{Cache-Control} HTTP headers that advertise a time-to-live
diff --git a/guix/scripts/publish.scm b/guix/scripts/publish.scm
index f54757b..70d914d 100644
--- a/guix/scripts/publish.scm
+++ b/guix/scripts/publish.scm
@@ -24,6 +24,7 @@
#:use-module (ice-9 match)
#:use-module (ice-9 regex)
#:use-module (ice-9 rdelim)
+ #:use-module (ice-9 threads)
#:use-module (rnrs bytevectors)
#:use-module (srfi srfi-1)
#:use-module (srfi srfi-2)
@@ -45,13 +46,15 @@
#:use-module (guix hash)
#:use-module (guix pki)
#:use-module (guix pk-crypto)
+ #:use-module (guix workers)
#:use-module (guix store)
#:use-module ((guix serialization) #:select (write-file))
#:use-module (guix zlib)
#:use-module (guix ui)
#:use-module (guix scripts)
- #:use-module ((guix utils) #:select (compressed-file?))
- #:use-module ((guix build utils) #:select (dump-port))
+ #:use-module ((guix utils)
+ #:select (with-atomic-file-output compressed-file?))
+ #:use-module ((guix build utils) #:select (dump-port mkdir-p))
#:export (%public-key
%private-key
@@ -70,6 +73,10 @@ Publish ~a over HTTP.\n") %store-directory)
-C, --compression[=LEVEL]
compress archives at LEVEL"))
(display (_ "
+ -c, --cache=DIRECTORY cache published items to DIRECTORY"))
+ (display (_ "
+ --workers=N use N workers to bake items"))
+ (display (_ "
--ttl=TTL announce narinfos can be cached for TTL seconds"))
(display (_ "
--nar-path=PATH use PATH as the prefix for nar URLs"))
@@ -154,6 +161,13 @@ if ITEM is already compressed."
(warning (_ "zlib support is missing; \
compression disabled~%"))
result))))))
+ (option '(#\c "cache") #t #f
+ (lambda (opt name arg result)
+ (alist-cons 'cache arg result)))
+ (option '("workers") #t #f
+ (lambda (opt name arg result)
+ (alist-cons 'workers (string->number* arg)
+ result)))
(option '("ttl") #t #f
(lambda (opt name arg result)
(let ((duration (string->duration arg)))
@@ -190,6 +204,9 @@ compression disabled~%"))
%default-gzip-compression
%no-compression))
+ ;; Default number of workers when caching is enabled.
+ (workers . ,(current-processor-count))
+
(address . ,(make-socket-address AF_INET INADDR_ANY 0))
(repl . #f)))
@@ -308,6 +325,121 @@ appropriate duration. NAR-PATH specifies the prefix for
nar URLs."
#:compression compression)
<>)))))
+(define* (nar-cache-file directory item
+ #:key (compression %no-compression))
+ (string-append directory "/"
+ (symbol->string (compression-type compression))
+ "/" (basename item) ".nar"))
+
+(define* (narinfo-cache-file directory item
+ #:key (compression %no-compression))
+ (string-append directory "/"
+ (symbol->string (compression-type compression))
+ "/" (basename item)
+ ".narinfo"))
+
+(define run-single-baker
+ (let ((baking (make-weak-value-hash-table))
+ (mutex (make-mutex)))
+ (lambda (item thunk)
+ "Run THUNK, which is supposed to bake ITEM, but make sure only one
+thread is baking ITEM at a given time."
+ (define selected?
+ (with-mutex mutex
+ (and (not (hash-ref baking item))
+ (begin
+ (hash-set! baking item (current-thread))
+ #t))))
+
+ (when selected?
+ (dynamic-wind
+ (const #t)
+ thunk
+ (lambda ()
+ (with-mutex mutex
+ (hash-remove! baking item))))))))
+
+(define-syntax-rule (single-baker item exp ...)
+ "Bake ITEM by evaluating EXP, but make sure there's only one baker for ITEM
+at a time."
+ (run-single-baker item (lambda () exp ...)))
+
+
+(define* (render-narinfo/cached store request hash
+ #:key ttl (compression %no-compression)
+ (nar-path "nar")
+ cache pool)
+ "Respond to the narinfo request for REQUEST. If the narinfo is available in
+CACHE, then send it; otherwise, return 404 and \"bake\" that nar and narinfo
+requested using POOL."
+ (let* ((item (hash-part->path store hash))
+ (compression (actual-compression item compression))
+ (cached (and (not (string-null? item))
+ (narinfo-cache-file cache item
+ #:compression compression))))
+ (cond ((string-null? item)
+ (not-found request))
+ ((file-exists? cached)
+ ;; Narinfo is in cache, send it.
+ (values `((content-type . (application/x-nix-narinfo))
+ ,@(if ttl
+ `((cache-control (max-age . ,ttl)))
+ '()))
+ (lambda (port)
+ (display (call-with-input-file cached
+ read-string)
+ port))))
+ ((valid-path? store item)
+ ;; Nothing in cache: bake the narinfo and nar in the background and
+ ;; return 404.
+ (eventually pool
+ (single-baker item
+ ;; (format #t "baking ~s~%" item)
+ (bake-narinfo+nar cache item
+ #:ttl ttl
+ #:compression compression
+ #:nar-path nar-path)))
+ (not-found request))
+ (else
+ (not-found request)))))
+
+(define* (bake-narinfo+nar cache item
+ #:key ttl (compression %no-compression)
+ (nar-path "/nar"))
+ "Write the narinfo and nar for ITEM to CACHE."
+ (let* ((compression (actual-compression item compression))
+ (nar (nar-cache-file cache item
+ #:compression compression))
+ (narinfo (narinfo-cache-file cache item
+ #:compression compression)))
+
+ (mkdir-p (dirname nar))
+ (match (compression-type compression)
+ ('gzip
+ ;; Note: the file port gets closed along with the gzip port.
+ (call-with-gzip-output-port (open-output-file (string-append nar
".tmp"))
+ (lambda (port)
+ (write-file item port))
+ #:level (compression-level compression))
+ (rename-file (string-append nar ".tmp") nar))
+ ('none
+ ;; When compression is disabled, we retrieve files directly from the
+ ;; store; no need to cache them.
+ #t))
+
+ (mkdir-p (dirname narinfo))
+ (with-atomic-file-output narinfo
+ (lambda (port)
+ ;; Open a new connection to the store. We cannot reuse the main
+ ;; thread's connection to the store since we would end up sending
+ ;; stuff concurrently on the same channel.
+ (with-store store
+ (display (narinfo-string store item
+ (%private-key)
+ #:nar-path nar-path
+ #:compression compression)
+ port))))))
+
;; XXX: Declare the 'Guix-Compression' HTTP header, which is in fact for
;; internal consumption: it allows us to pass the compression info to
;; 'http-write', as part of the workaround to <http://bugs.gnu.org/21093>.
@@ -339,6 +471,21 @@ appropriate duration. NAR-PATH specifies the prefix for
nar URLs."
store-path)
(not-found request))))
+(define* (render-nar/cached store cache request store-item
+ #:key (compression %no-compression))
+ "Respond to REQUEST with a nar for STORE-ITEM. If the nar is in CACHE,
+return it; otherwise, return 404."
+ (let ((cached (nar-cache-file cache store-item
+ #:compression compression)))
+ (if (file-exists? cached)
+ (values `((content-type . (application/octet-stream
+ (charset . "ISO-8859-1"))))
+ ;; XXX: We're not returning the actual contents, deferring
+ ;; instead to 'http-write'. This is a hack to work around
+ ;; <http://bugs.gnu.org/21093>.
+ cached)
+ (not-found request))))
+
(define (render-content-addressed-file store request
name algo hash)
"Return the content of the result of the fixed-output derivation NAME that
@@ -495,6 +642,7 @@ blocking."
(define* (make-request-handler store
#:key
+ cache pool
narinfo-ttl
(nar-path "nar")
(compression %no-compression))
@@ -515,10 +663,17 @@ blocking."
(((= extract-narinfo-hash (? string? hash)))
;; TODO: Register roots for HASH that will somehow remain for
;; NARINFO-TTL.
- (render-narinfo store request hash
- #:ttl narinfo-ttl
- #:nar-path nar-path
- #:compression compression))
+ (if cache
+ (render-narinfo/cached store request hash
+ #:cache cache
+ #:pool pool
+ #:ttl narinfo-ttl
+ #:nar-path nar-path
+ #:compression compression)
+ (render-narinfo store request hash
+ #:ttl narinfo-ttl
+ #:nar-path nar-path
+ #:compression compression)))
;; /nar/file/NAME/sha256/HASH
(("file" name "sha256" hash)
(guard (c ((invalid-base32-character? c)
@@ -534,13 +689,16 @@ blocking."
;; /nar/gzip/<store-item>
((components ... "gzip" store-item)
(if (and (nar-path? components) (zlib-available?))
- (render-nar store request store-item
- #:compression
- (match compression
- (($ <compression> 'gzip)
- compression)
- (_
- %default-gzip-compression)))
+ (let ((compression (match compression
+ (($ <compression> 'gzip)
+ compression)
+ (_
+ %default-gzip-compression))))
+ (if cache
+ (render-nar/cached store cache request store-item
+ #:compression compression)
+ (render-nar store request store-item
+ #:compression compression)))
(not-found request)))
;; /nar/<store-item>
@@ -555,8 +713,11 @@ blocking."
(define* (run-publish-server socket store
#:key (compression %no-compression)
- (nar-path "nar") narinfo-ttl)
+ (nar-path "nar") narinfo-ttl
+ cache pool)
(run-server (make-request-handler store
+ #:cache cache
+ #:pool pool
#:nar-path nar-path
#:narinfo-ttl narinfo-ttl
#:compression compression)
@@ -606,6 +767,8 @@ blocking."
(socket (open-server-socket address))
(nar-path (assoc-ref opts 'nar-path))
(repl-port (assoc-ref opts 'repl))
+ (cache (assoc-ref opts 'cache))
+ (workers (assoc-ref opts 'workers))
;; Read the key right away so that (1) we fail early on if we can't
;; access them, and (2) we can then drop privileges.
@@ -631,6 +794,12 @@ consider using the '--user' option!~%")))
(repl:spawn-server (repl:make-tcp-server-socket #:port repl-port)))
(with-store store
(run-publish-server socket store
+ #:cache cache
+ #:pool (and cache (make-pool workers))
#:nar-path nar-path
#:compression compression
#:narinfo-ttl ttl))))))
+
+;;; Local Variables:
+;;; eval: (put 'single-baker 'scheme-indent-function 1)
+;;; End:
diff --git a/tests/publish.scm b/tests/publish.scm
index ea0f4a3..233b71c 100644
--- a/tests/publish.scm
+++ b/tests/publish.scm
@@ -314,4 +314,58 @@ References: ~%"
(call-with-input-string "" port-sha256))))))
(response-code (http-get uri))))
+(unless (zlib-available?)
+ (test-skip 1))
+(test-equal "with cache"
+ (list #t
+ `(("StorePath" . ,%item)
+ ("URL" . ,(string-append "nar/gzip/" (basename %item)))
+ ("Compression" . "gzip"))
+ 200 ;nar/gzip/…
+ #t ;Content-Length
+ 200) ;nar/…
+ (call-with-temporary-directory
+ (lambda (cache)
+ (define (wait-for-file file)
+ (let loop ((i 20))
+ (or (file-exists? file)
+ (begin
+ (pk 'wait-for-file file)
+ (sleep 1)
+ (loop (- i 1))))))
+
+ (let ((thread (with-separate-output-ports
+ (call-with-new-thread
+ (lambda ()
+ (guix-publish "--port=6797" "-C2"
+ (string-append "--cache=" cache)))))))
+ (wait-until-ready 6797)
+ (let* ((base "http://localhost:6797/")
+ (part (store-path-hash-part %item))
+ (url (string-append base part ".narinfo"))
+ (nar-url (string-append base "/nar/gzip/" (basename %item)))
+ (cached (string-append cache "/gzip/" (basename %item)
+ ".narinfo"))
+ (nar (string-append cache "/gzip/"
+ (basename %item) ".nar"))
+ (response (http-get url)))
+ (and (= 404 (response-code response))
+ (wait-for-file cached)
+ (let ((body (http-get-port url))
+ (compressed (http-get nar-url))
+ (uncompressed (http-get (string-append base "nar/"
+ (basename %item)))))
+ (list (file-exists? nar)
+ (filter (lambda (item)
+ (match item
+ (("Compression" . _) #t)
+ (("StorePath" . _) #t)
+ (("URL" . _) #t)
+ (_ #f)))
+ (recutils->alist body))
+ (response-code compressed)
+ (= (response-content-length compressed)
+ (stat:size (stat nar)))
+ (response-code uncompressed)))))))))
+
(test-end "publish")
- branch master updated (54412eb -> d72b420), Ludovic Courtès, 2017/04/18
- 02/06: publish: Introduce 'actual-compression'., Ludovic Courtès, 2017/04/18
- 03/06: publish: Use 'sendfile' when possible., Ludovic Courtès, 2017/04/18
- 06/06: publish: Remove expired cache entries when '--ttl' is used., Ludovic Courtès, 2017/04/18
- 01/06: Add (guix workers)., Ludovic Courtès, 2017/04/18
- 05/06: Add (guix cache) and use it in (guix scripts substitute)., Ludovic Courtès, 2017/04/18
- 04/06: publish: Add '--cache' and '--workers'.,
Ludovic Courtès <=