Re: Proposition to streamline our NAR collection to just zstd-compressed
From: Efraim Flashner
Subject: Re: Proposition to streamline our NAR collection to just zstd-compressed ones
Date: Mon, 15 Jan 2024 10:31:07 +0200
On Wed, Jan 10, 2024 at 12:36:51PM +0100, Ludovic Courtès wrote:
> Hello,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
> > It's been on my head for quite a bit of time (about 2 years, according
> > to [0]), to streamline our offering of cached nars. Letting go of gzip
> > 2 years ago, along a more aggressive garbage collection policy allowed
> > us to reduce our storage needs by at least 6.5 TiB. I'm proposing to do
> > the same with our lzip compressed nars, to let go of an additional 3.9
> > TiB:
>
> Those space savings would be welcome.
>
> > The above suggests that zstd compressed nars are about 5% larger than
> > the lzip ones, which is not big enough to justify carrying both, in my
> > opinion. In exchange for a little bit more bandwidth, users would have
> > the nars decompressed much faster with less CPU overhead locally.
>
> The difference is slightly higher, with lzip being 8% smaller, for a big
> package like ungoogled-chromium or icecat:
>
> --8<---------------cut here---------------start------------->8---
> $ wget -qO- https://ci.guix.gnu.org/7n95j1zlnwzc44azjs7nj8givnzdfs87.narinfo | grep -B1 ^FileSize
> Compression: lzip
> FileSize: 85783483
> --
> Compression: zstd
> FileSize: 92796393
> $ wget -qO- https://ci.guix.gnu.org/prpjnnnhay0alanmkgjh66vfwjlb98kq.narinfo | grep -B1 ^FileSize
> Compression: lzip
> FileSize: 295991
> --
> Compression: zstd
> FileSize: 323456
> --8<---------------cut here---------------end--------------->8---
>
> But yeah, even though adaptive compression selection on the client is a
> minor improvement, whether it warrants the extra space is debatable.
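
For reference, the "8% smaller" figure can be recomputed from the FileSize
values quoted above. This is only a small sketch: size_gap is a name made up
for this example, and it assumes an awk is available. It prints the gap both
ways, since "lzip is N% smaller" and "zstd is N% larger" are not the same
number.

```shell
# size_gap LZIP_BYTES ZSTD_BYTES
# Hypothetical helper: print how much smaller the lzip nar is (as a
# percentage of the zstd size) and how much larger the zstd nar is (as a
# percentage of the lzip size).
size_gap() {
    awk -v lzip="$1" -v zstd="$2" 'BEGIN {
        printf "lzip is %.1f%% smaller; zstd is %.1f%% larger\n",
               (zstd - lzip) / zstd * 100,
               (zstd - lzip) / lzip * 100
    }'
}

# FileSize values taken from the two narinfo queries quoted above.
size_gap 85783483 92796393
size_gap 295991 323456
```

Both packages come out at roughly 8% in favour of lzip, which matches the
numbers quoted above.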

There's another zstd flag that we should probably add: --rsyncable. From
the zstd man page:

    --rsyncable: zstd will periodically synchronize the compression state to
    make the compressed file more rsync-friendly. There is a negligible
    impact to compression ratio, and a potential impact to compression
    speed, perceptible at higher speeds, for example when combining
    --rsyncable with many parallel worker threads. This feature does
    not work with --single-thread. You probably don't want to use it with
    long range mode, since it will decrease the effectiveness of the
    synchronization points, but your mileage may vary.
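
To make that concrete, here is a rough command-line sketch. It only
illustrates the flag with the zstd CLI; it is not how the build farm
compresses nars, and the compress_rsyncable helper and the file name in the
usage line are placeholders invented for this example.

```shell
# compress_rsyncable FILE
# Hypothetical helper: compress FILE to FILE.zst at level 19 with
# rsync-friendly synchronization points.  The input file is kept.
#   -q  quiet, -f  overwrite an existing FILE.zst, -o  output file name
compress_rsyncable() {
    zstd -19 --rsyncable -q -f -o "$1.zst" "$1"
}
```

Usage would be something like "compress_rsyncable some-package.nar".
Compressing the same file with and without --rsyncable and comparing the
sizes would show how negligible the ratio impact is for our nars.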
> > What do you think? Should we go ahead and effect the following simple
> > change for the Berlin build farm?
> >
> > modified hydra/modules/sysadmin/services.scm
> > @@ -683,7 +683,7 @@ to a selected directory.")
> > ;;
> > <https://lists.gnu.org/archive/html/guix-devel/2021-01/msg00097.html>
> > ;; for the compression ratio/decompression speed
> > ;; tradeoffs.
> > - (compression '(("lzip" 9) ("zstd" 19)))
> > + (compression '(("zstd" 19)))
>
> No objection from me, but…
>
> … an important consideration: zstd support was added in 1.3.0, released
> in May 2021.
>
> From experience we know that users on foreign distros rarely, if ever,
> upgrade the daemon (on top of that, upgrading the daemon is non-trivial
> to someone who initially installed the Debian package, from what I’ve
> seen, because one needs to fiddle with the .service file to adjust file
> names and the likes), and we can be sure that many are still running an
> old daemon. We spent a lot of time on user support after gzip
> substitutes had been removed (‘guix substitute’ would just crash) and we
> must avoid that.
>
> (guix store) emits a warning when connecting to an “old” daemon, but
> only for daemons older than 2018. We could emit a warning based on
> whether or not “builtin:git-download” is available, but maybe that’s too
> early?

builtin:git-download sometimes bites me on my machines, since I don't
upgrade my aarch64/riscv64 installs that often.

Also, 2018 is now about 5 years ago. It might be a good idea to just
have a rolling YEAR-3 warning telling users that their daemon is getting
old and that they might be missing out on features present in newer
daemon versions.
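
The rolling check I have in mind is no more than the following, written as a
shell sketch rather than actual (guix store) code. It assumes the client can
learn the daemon's release year somehow, and it reads "YEAR-3" as "released
before the current year minus three". The function names are made up.

```shell
# daemon_too_old RELEASE_YEAR [CURRENT_YEAR]
# Hypothetical check: succeed (exit 0) when RELEASE_YEAR is earlier than
# CURRENT_YEAR - 3.  CURRENT_YEAR defaults to the year reported by date(1).
daemon_too_old() {
    release_year=$1
    current_year=${2:-$(date +%Y)}
    [ "$release_year" -lt $((current_year - 3)) ]
}

# warn_if_old RELEASE_YEAR [CURRENT_YEAR]
# Print a warning on stderr when the daemon is past the rolling cutoff.
warn_if_old() {
    if daemon_too_old "$@"; then
        echo "warning: daemon from $1 is getting old and may lack features of newer daemons" >&2
    fi
}
```

With this reading, in 2024 a daemon from 2018 would trigger the warning while
one from 2021 would not. The cutoff moves forward every year, so nobody has
to remember to bump it.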
> In addition to the warning, we should communicate in advance and make
> sure our instructions on how to upgrade the daemon are accurate and
> clear.
>
> Thoughts?
>
> Ludo’.
>
--
Efraim Flashner <efraim@flashner.co.il> רנשלפ םירפא
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted