help-guix

Re: Guix Docker image inflation


From: Stephen Scheck
Subject: Re: Guix Docker image inflation
Date: Sat, 30 May 2020 13:02:02 -0400

On Fri, May 29, 2020 at 7:31 PM Chris Marusich <cmmarusich@gmail.com> wrote:

>
> Could it be that you are accumulating layers without bound?
>
>
> https://developers.redhat.com/blog/2016/03/09/more-about-docker-images-size/
>
> Since Docker images are built up of immutable layers, if you build your
> image from an existing base image, I'm not sure that it's possible to
> produce a new image that is smaller than the base image.  Basically,
> even if you run "guix gc" to remove dead store items, they will still
> exist on a prior layer, so the size of the new image won't decrease.
> And since you're installing new things, the size will actually increase.
> If you repeat this process by using the new image as an input for yet
> another build, I think you will accumulate layers and storage space
> without bound.
>

Layers certainly add some image size overhead, but I don't think that is
the culprit here. And producing a smaller image isn't really the goal;
the goal is just to keep image growth reasonable between each
incremental guix pull. Dead store items would only exist on previous
layers if they made it there in the first place. As has been
demonstrated in previous posts in the thread, I believe the problem is
some guix bug which prevents deletion of garbage-collected store items.

What is reasonable growth? That is hard to answer, but I would expect it
to be roughly proportional to the growth of a guix installation over
time in a non-Docker environment, taking some constant amount of layer
overhead as a given.

I don't really know what `guix pull` does, but I think it's something
along these lines: 1) the global package index is brought up-to-date;
2) any packages which are installed in the profile doing the pull are
upgraded to newer versions if they've been updated. So day-to-day,
particularly in the case where there have been no updates to packages
installed in the profile, size growth should be very small. Periodic
"rebasing" of incremental Docker images, using one of the layer
squashing tools out there, might still be helpful from time to time, but
I don't think it should be necessary on a daily basis.
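
As one example of such rebasing, a container's file system can be
exported and re-imported as a single-layer image using only stock Docker
commands. This is a minimal sketch, not a recommendation of a particular
tool: the function name `flatten_image` and the tags in the usage line
are made up for illustration. Note that `docker import` drops image
metadata such as CMD and ENV, so those would need to be re-applied.

```shell
#!/bin/sh
# Flatten an image into a single layer via export/import.
# Usage: flatten_image <source-image> <new-tag>   (both names are placeholders)
flatten_image() {
    src_image=$1
    dst_tag=$2

    # `docker create` makes a container without starting it. A dummy command
    # is given in case the image defines no default CMD.
    container=$(docker create "$src_image" true) || return 1

    # `docker export` streams the container's merged file system as a tar
    # archive; `docker import -` builds a one-layer image from that stream.
    docker export "$container" | docker import - "$dst_tag" >/dev/null
    status=$?

    # Clean up the temporary container whether or not the import succeeded.
    docker rm "$container" >/dev/null
    return $status
}

# Example (hypothetical tags):
#   flatten_image guix:daily guix:flat
```

The trade-off is the one described next: a flattened image shares no
layers with earlier images, so it has to be downloaded in full.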

Also, layers are helpful for someone who pulls down the daily Guix
Docker images on a frequent basis, because then only the new, ideally
small layers need to be downloaded, whereas if you rebase for every
image build, you'd have to download the entire image every day.

The boundless layer accumulation you point out shouldn't be a problem
with the way that I'm building the images. When you do a `RUN <command>`
inside a Dockerfile, it is essentially doing
`docker exec <container> <command>` followed by
`docker commit <container>`. It is the commit step which produces a new
layer. You can think of a RUN command inside a Dockerfile as a kind of
single-step transaction, which incorporates the net file system changes
into the image.

My build script issues several `docker exec <container> <command>`
sequences, followed by a single `docker commit <container>`.
Intermediate changes to the container file system prior to the commit do
not generate layers; only the net changes present at commit time do.
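
The pattern can be sketched roughly as below. This is an illustration
rather than the actual build script: the function name
`build_incremental`, the image names, and the example commands are
placeholders, and it assumes the base image ships `sh` and a `sleep`
that accepts `infinity`.

```shell
#!/bin/sh
# Run several commands in one container, then commit exactly once.
# Usage: build_incremental <base-image> <new-tag> <command>...
# Each <command> is a single string handed to `sh -c` inside the container.
build_incremental() {
    base_image=$1
    new_tag=$2
    shift 2

    # Start a long-lived container to exec into.
    container=$(docker run -d "$base_image" sleep infinity) || return 1

    for step in "$@"; do
        # Changes made here stay in the container's writable layer;
        # no image layer is created by `docker exec`.
        if ! docker exec "$container" sh -c "$step"; then
            docker rm -f "$container" >/dev/null
            return 1
        fi
    done

    # The single commit records only the net file system changes.
    docker commit "$container" "$new_tag" >/dev/null || return 1
    docker rm -f "$container" >/dev/null
}

# Example (hypothetical names and commands):
#   build_incremental guix:yesterday guix:today "guix pull" "guix gc"
```

However many steps are passed in, the new image gains one layer, from
the one commit at the end.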

You can convince yourself of this by doing something like the following:

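    # start a detached container that stays up so we can exec into it
    docker run -d <some-linux-image> sleep infinity
    # write 1 GiB of random data, then commit
    docker exec <container-id> dd if=/dev/urandom of=/RANDOM-DATA bs=1048576 count=1024
    docker commit <container-id>
    # remove the data again, then commit a second time
    docker exec <container-id> rm /RANDOM-DATA
    docker commit <container-id>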

You'll end up with two new images - the first one should be about 1 GB
larger than the base image, the second one the same size as the base
image.
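
To compare the sizes numerically rather than by eyeballing the output of
`docker images`, something like the following helper could be used. It
is only a sketch: the function names and the image names in the usage
line are placeholders, and it relies on `docker image inspect` reporting
the image size in bytes through the `{{.Size}}` template.

```shell
#!/bin/sh
# Print an image's size in bytes.
image_size_bytes() {
    docker image inspect --format '{{.Size}}' "$1"
}

# Print how much larger <new-image> is than <base-image>, in MiB.
# Usage: size_delta_mib <base-image> <new-image>
size_delta_mib() {
    base_bytes=$(image_size_bytes "$1") || return 1
    new_bytes=$(image_size_bytes "$2") || return 1
    # Integer division, so the result is rounded toward zero.
    echo $(( (new_bytes - base_bytes) / 1048576 ))
}

# Example (placeholders; image IDs printed by `docker commit` work too):
#   size_delta_mib <some-linux-image> <first-committed-image>
```

Running `docker history <image>` on each committed image also lists the
individual layers with their sizes.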

> FYI, Guix itself can build Docker images from scratch - no base image
> required!  It can even build a Docker image of a full-blown Guix System
> from scratch.  Sorry if you already knew that - I just wanted to point
> it out in case you didn't!
>

Yes, thanks, I know - if you read through the thread you'll see that I
make reference to `guix system docker-image [...]`.

-SS

