guix-devel


From: Csepp
Subject: Re: How many bytes do we add (closure of guix) when adding one new package?
Date: Tue, 30 May 2023 21:10:07 +0200

Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On ven., 26 mai 2023 at 18:21, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> I agree that .go files are quite big (.scm files as well, but we’ve
>> improved information density somewhat by removing input labels :-)).
>>
>> The size of .go files went down when we switched to the baseline
>> compiler (aka. -O1):
>>
>>   https://lists.gnu.org/archive/html/guix-devel/2020-06/msg00071.html
>>
>> That thread has ideas of things to do to further reduce .go size.
>
> Just to put a figure on what “big” means: currently the .go files are 5
> times bigger than their associated .scm files.
>
> Somehow, it’s the trap of a DSL. :-) Packages are declarative and the
> information they declare is not dense.  However, because they are
> bytecompiled to a general programming language, their specificity is not
> exploited.  In an ideal world, the compiled binary representation of the
> packages should be smaller than their human-readable text-file
> counterpart.
>
> The mentioned improvement is nice.  And it’s visible:
>
> --8<---------------cut here---------------start------------->8---
> 145M  /gnu/store/nqrb3g4l59wd74w8mr9v0b992bj2sd1w-guix-d62c9b267-modules/lib/guile/3.0/site-ccache/gnu
> 117M  /gnu/store/s6rqlhqr750k44ynkqqj5mwjj2cs2yln-guix-a09968565-modules/lib/guile/3.0/site-ccache/gnu
> 127M  /gnu/store/ndii4bpyzh2rc05ya61s89rig9hdrl4k-guix-a0178d34f-modules/lib/guile/3.0/site-ccache/gnu
> 164M  /gnu/store/ni63a203jf61dwxlv8kr9b8x3vb1pdsp-guix-8e2f32cee-modules/lib/guile/3.0/site-ccache/gnu
> --8<---------------cut here---------------end--------------->8---
>
> However, it has almost no impact on the overall size, which scales
> with the number of packages.
>
>> Download size has to be treated separately though.  For example, ‘git
>> pull’ doesn’t redownload all of the repo or directory, and it uses
>> compression heavily.  Thus, a few hundred bytes of additional .scm text
>> translate into less than that.
>>
>> As for the rest, download size can be reduced for example by choosing a
>> content-addressed transport, like something based on ERIS.
>>
>> I think we must look precisely at what we want to optimize—on-disk size,
>> or bandwidth requirement, in particular—and look at the whole solution
>> space.
>
> I think one direction is to tackle the way *package-modules* is built.
> Because of that, Guix is building too much and the design is not optimal
> – whatever technical solutions we implement for improving after that.
>
> On my poor laptop, Guix is becoming unusable because many operations are
> becoming so slow, while they are still acceptable with Debian’s APT.  For
> instance, it takes something like 20 minutes to run “guix pull” without
> substitutes.  And when I am traveling without a fast Internet
> connection, it’s often too much for the network at hand.
>
> Currently, “guix pull” is either building too much or downloading too
> much, by design.
>
>
> Cheers,
> simon
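
For anyone who wants to check the "5 times bigger" figure on their own
machine, here is a rough sketch.  It only sums file sizes by suffix
under two directories; the function names and the two directory
arguments are my own placeholders, not anything Guix provides, and you
would have to point them at wherever the .scm and .go files of your
Guix checkout or profile actually live.

```python
from pathlib import Path


def total_size(root, suffix):
    """Sum the sizes in bytes of all regular files under root ending in suffix."""
    return sum(
        path.stat().st_size
        for path in Path(root).rglob(f"*{suffix}")
        if path.is_file()
    )


def go_to_scm_ratio(scm_root, go_root):
    """Return (total .go bytes under go_root) / (total .scm bytes under scm_root).

    Both arguments are hypothetical paths chosen by the caller.
    """
    scm_bytes = total_size(scm_root, ".scm")
    go_bytes = total_size(go_root, ".go")
    if scm_bytes == 0:
        raise ValueError("no .scm files found under scm_root")
    return go_bytes / scm_bytes
```

Calling go_to_scm_ratio() with the source directory and the matching
site-ccache directory gives the ratio for that one revision; it says
nothing about download size, which is a separate question.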

Something I've been considering is whether Guix could make use of
database optimizations for its packages.  Having access to Scheme for
everything is nice, but using it as a storage solution is kind of silly
when we are mostly just storing structs.  Some kind of struct-of-arrays
(SoA) optimization could definitely reduce their size by a lot, and
might even speed up some operations.  It makes zero sense to load full
package definitions from disk for most queries, such as guix search;
with an SoA representation we could load only the fields that we care
about.
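
To make the idea concrete, here is a minimal sketch in Python rather
than Guile, purely for illustration.  It is not how Guix stores
packages today: the field list, the one-JSON-file-per-field layout and
all function names are assumptions of mine.  Each field becomes its own
column on disk, and a search reads only the name and synopsis columns.

```python
import json
from pathlib import Path

# Illustrative subset of package fields (an assumption, not Guix's record).
FIELDS = ("name", "version", "synopsis", "description")


def save_columns(packages, directory):
    """Write one file per field; entry i of every file describes package i."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for field in FIELDS:
        column = [package[field] for package in packages]
        (directory / f"{field}.json").write_text(
            json.dumps(column), encoding="utf-8"
        )


def load_column(directory, field):
    """Read a single field for all packages; other column files are not opened."""
    path = Path(directory) / f"{field}.json"
    return json.loads(path.read_text(encoding="utf-8"))


def search(directory, term):
    """Return names of packages whose name or synopsis contains term.

    Only the "name" and "synopsis" columns are read from disk.
    """
    term = term.lower()
    names = load_column(directory, "name")
    synopses = load_column(directory, "synopsis")
    return [
        name
        for name, synopsis in zip(names, synopses)
        if term in name.lower() or term in synopsis.lower()
    ]
```

The point of the layout is that search() never opens the description
column, which is usually the largest one.  A real design would also
need a way to get from a matching index back to the full package
definition, which this sketch leaves out.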

PS: Now I'm even more glad that I'm using a file system with
transparent compression on all my Guix systems.


