guix-patches

[bug#50384] [PATCH v3] Optimise search-patch (reducing I/O)


From: Maxime Devos
Subject: [bug#50384] [PATCH v3] Optimise search-patch (reducing I/O)
Date: Thu, 09 Sep 2021 22:25:46 +0200
User-agent: Evolution 3.34.2

Hi guix,

This is v3, without the base16 and base32 optimisations, which have been
split off into <https://issues.guix.gnu.org/50456>.  This patch series
does not seem to bring measurable improvements, but feel free to test.
In particular, I wonder whether it helps people using a remote daemon,
where transmitting data can take (relatively) long.

(guix scripts hash) is broken, which would need to be fixed in the final
version, if any.  Ludovic has some concerns about dependency tracking in
search-patch which need to be addressed.

I think a more fruitful goal is to somehow parallelize the derivation
computation, with multiple separate connections to the store, such that
if one connection is blocking, the other one can be used for something
separate (threads aren't necessary if current-read-waiter,
current-write-waiter and non-blocking I/O are used).
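
The idea of keeping several store connections busy without threads can be
illustrated outside Guix.  The following is only a hypothetical Python
asyncio analogy, not Guix or Guile code: 'fake_rpc', the connection names
and the delays are invented stand-ins for store RPCs, and the asyncio event
loop plays the role that current-read-waiter and current-write-waiter would
play in Guile.

```python
import asyncio
import time

async def fake_rpc(connection, name, delay):
    # Hypothetical stand-in for one store RPC: awaiting yields control
    # while this "connection" waits, so other connections can progress.
    await asyncio.sleep(delay)
    return (connection, name)

async def run_connection(connection, rpcs):
    # RPCs on a single connection stay sequential.
    results = []
    for name, delay in rpcs:
        results.append(await fake_rpc(connection, name, delay))
    return results

async def main():
    work = {
        "conn-a": [("add-text-to-store", 0.05), ("add-to-store", 0.05)],
        "conn-b": [("add-text-to-store", 0.05), ("add-to-store/tree", 0.05)],
    }
    start = time.monotonic()
    # Both connections are driven concurrently by one event loop.
    per_connection = await asyncio.gather(
        *(run_connection(c, rpcs) for c, rpcs in work.items()))
    elapsed = time.monotonic() - start
    return per_connection, elapsed

if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results)
    print(f"elapsed: {elapsed:.2f} s")
```

Each connection still processes its own RPCs in order; only the waiting
overlaps, which is the effect hoped for when one store connection blocks.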

Now, what improvements does this version of the patch series bring?
(Make sure to start the daemon with ./pre-inst-env guix-daemon ...,
and set --localstatedir=/var!  Some changes to the daemon were made.)

1.  RPC count (tested in a local checkout)

    After the patch series:
make && GUIX_PROFILING=rpc ./pre-inst-env guix build -d pigx --no-grafts
accepted connection from pid 4917, user [USER]

/gnu/store/jfjfg7dnis7v6947a0rncxdn3y1nz0ad-pigx-0.0.3.drv
Remote procedure call summary: 5754 RPCs
  built-in-builders              ...     1
  add-to-store                   ...     3
  add-to-store/tree              ...    26
  add-temp-root-and-valid-path?  ...   195
  add-text-to-store              ...  5529

    After the patch series, with (if sha256 ...) replaced by (if #f ...)
    in (guix gexp), to simulate the situation before the patch series:

/gnu/store/jfjfg7dnis7v6947a0rncxdn3y1nz0ad-pigx-0.0.3.drv
Remote procedure call summary: 5749 RPCs
  built-in-builders              ...     1
  add-to-store/tree              ...    26
  add-to-store                   ...   193
  add-text-to-store              ...  5529

(add-to-store RPCs are converted to add-temp-root-and-valid-path? RPCs)
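
The two summaries above can be cross-checked with a little arithmetic.  The
sketch below (Python, with the counts copied from the two summaries; the
variable and function names are only for illustration) confirms the totals
and shows which RPC types account for the difference.

```python
# RPC counts copied from the two summaries above.
with_sha256 = {
    "built-in-builders": 1,
    "add-to-store": 3,
    "add-to-store/tree": 26,
    "add-temp-root-and-valid-path?": 195,
    "add-text-to-store": 5529,
}
with_false = {
    "built-in-builders": 1,
    "add-to-store/tree": 26,
    "add-to-store": 193,
    "add-text-to-store": 5529,
}

def rpc_delta(after, before):
    """Per-RPC change in count (after - before), omitting unchanged entries."""
    names = sorted(set(after) | set(before))
    delta = {n: after.get(n, 0) - before.get(n, 0) for n in names}
    return {n: d for n, d in delta.items() if d != 0}

if __name__ == "__main__":
    print(sum(with_sha256.values()), sum(with_false.values()))
    print(rpc_delta(with_sha256, with_false))
```

The totals match the reported 5754 and 5749: 190 add-to-store calls
disappear and 195 add-temp-root-and-valid-path? calls appear, a net change
of five RPCs.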

2. Timing

   First do
         echo powersave | sudo tee /sys/devices/system/cpu/cpu{0,1,2,3}/cpufreq/scaling_governor
   to eliminate CPU frequency scaling effects.
   To automatically repeat the tests and compute the standard deviation,
   'hyperfine' is used:
   
HYP=/gnu/store/3ya4iw6fzq1ns73bv1g3a96jvwhbv60c-hyperfine-1.11.0/bin/hyperfine

   To determine the effect of the change to 'local-file-compiler' and
   'search-patch' and nothing else, I will compare the performance of guix
   after the patch series with the performance of guix after the patch series
   and 'sha256' replaced by #false.

   With #f, --runs=60:
   make && ./pre-inst-env $HYP --runs=60 --warmup 1 -- 'guix build -d pigx --no-grafts'
   Time (mean ± σ):     15.428 s ±  0.385 s    [User: 15.925 s, System: 0.652 s]
   Range (min … max):   14.768 s … 16.550 s    60 runs

   With sha256, --runs=60:
   make && ./pre-inst-env $HYP --runs=60 --warmup 1 -- 'guix build -d pigx --no-grafts'
   Time (mean ± σ):     15.493 s ±  0.252 s    [User: 15.585 s, System: 0.680 s]
   Range (min … max):   14.981 s … 16.294 s    60 runs

  These numbers don't show a clear difference.  Maybe statistics can help?
  First, formulate a null hypothesis.  As the total number of RPCs barely
  changed, the amount of data sent to the daemon is reduced, and some
  "stat", "open" and "read" calls are avoided, I would expect the mean to
  decrease.  Thus, as null hypothesis, I choose:

  H0: the (theoretical) mean for ‘with sha256’ is less than the mean for
  ‘with #f’

  In the timing tests, the observed mean for ‘with sha256’ is actually larger.
  But is this significant?

  guix environment --ad-hoc r
  before.mean   = 15.428
  before.stddev = 0.385
  after.mean    = 15.493
  after.stddev  = 0.252
  samples = 60

  # ‘statistical’ crate used by hyperfine
  # performs N/(N-1) correction XXX

  t = (before.mean - after.mean)/(sqrt(samples) * sqrt(before.stddev^2 + after.stddev^2))
  v = (samples - 1) * (before.stddev^2 + after.stddev^2)^2/(before.stddev^4 + after.stddev^4)

  q = pt(-t, v); q
  # p-value: 0.5072571
  # Null-hypothesis is not rejected

  It's not rejected, though that doesn't prove much since t is almost zero,
  so this test cannot reject the hypothesis ‘the means are equal’ or ‘the patch
  series makes things slower’ either.
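
  As a cross-check: the textbook form of Welch's t statistic divides the
  difference of the means by sqrt(s1^2/n + s2^2/n), so sqrt(samples) ends up
  in the numerator rather than in the denominator as in the expression for t
  above.  The following sketch recomputes the test from the same summary
  statistics under that assumption.  It is written in Python with only the
  standard library (not R), the function names are illustrative, and the
  Student t CDF is approximated by Simpson integration of the density, so
  the printed p-value is approximate.

```python
import math

def welch(mean1, sd1, mean2, sd2, n):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    for two samples of equal size n."""
    var_sum = sd1**2 + sd2**2
    t = (mean1 - mean2) / math.sqrt(var_sum / n)
    v = (n - 1) * var_sum**2 / (sd1**4 + sd2**4)
    return t, v

def t_pdf(x, v):
    """Density of Student's t distribution with v degrees of freedom."""
    log_c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi))
    return math.exp(log_c - (v + 1) / 2 * math.log1p(x * x / v))

def t_cdf(x, v, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, v) + t_pdf(a, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half

if __name__ == "__main__":
    # 'before' is "with #f", 'after' is "with sha256".
    t, v = welch(15.428, 0.385, 15.493, 0.252, 60)
    # H0 is rejected when 'with sha256' is much slower, i.e. t very negative.
    p = t_cdf(t, v)
    print(f"t = {t:.3f}, v = {v:.1f}, p = {p:.3f}")
```

  With these inputs t comes out near -1.09 with about 102 degrees of
  freedom, and the one-sided p-value stays well above 0.05, so the null
  hypothesis is still not rejected.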

  I don't think this patch series helps on my laptop (at least with a hot
  disk cache; I'd have to check with a cold cache).  However, I wonder
  whether it would help a little for people using a remote build daemon
  (with an NFS setup or something; see GUIX_DAEMON_SOCKET).

Greetings,
Maxime.

Attachment: 0001-build-self-Implement-basic-hash-algorithm.patch
Description: Text Data

Attachment: 0002-guix-hash-Extract-file-hashing-procedures.patch
Description: Text Data

Attachment: 0003-store-Define-new-add-temp-root-and-valid-path-operat.patch
Description: Text Data

Attachment: 0004-store-Add-compatibility-fall-back-for-add-temp-root-.patch
Description: Text Data

Attachment: 0005-gexp-Allow-computing-the-hash-of-the-local-file-in-a.patch
Description: Text Data

Attachment: 0006-gexp-Allow-overriding-the-absolute-file-name.patch
Description: Text Data

Attachment: 0007-packages-Compute-the-hash-of-patches-in-advance-when.patch
Description: Text Data

Attachment: 0008-compile-all-compile-Keep-track-of-dependencies-of-co.patch
Description: Text Data

Attachment: 0009-packages-Add-patches-to-the-dependency-list-of-packa.patch
Description: Text Data

Attachment: 0010-gexp-Do-not-intern-if-the-file-is-already-in-the-sto.patch
Description: Text Data

Attachment: signature.asc
Description: This is a digitally signed message part

