Re: [Gnu-arch-users] Storage efficiency of revlibs


From: Ludovic Courtès
Subject: Re: [Gnu-arch-users] Storage efficiency of revlibs
Date: Tue, 13 Dec 2005 14:17:28 +0100
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)

Hi,

Mikhael Goikhman <address@hidden> writes:

> Nope, this would be too trivial to mean. I meant that you possibly came
> to the conclusion too early. You said your revlib compression ratio is 4
> or 8. I asked you to show a sample ratio of cacherev of patch-300 versus
> revlib tree of the same revision. So we may compare these two ratios and
> judge whether it is really correct to conclude for your project that
> "revlib consumes slightly less disk than cacherev per _every_ revision".

I believe you're misunderstanding me.  In my previous post, I said:

  Well, there's no such thing as a "one size fits all" solution.  You
  have to make tradeoffs.  [...] Furthermore, which compression
  technique works best is highly dependent on the project you're working
  on.

In particular, I found the revlib technique to be quite storage
efficient _on the particular project I was interested in_ (i.e., a
project where only a small fraction of files is touched in between
revisions).  I do agree that there are other cases where the revlib
technique is _not_ space-efficient at all, especially compared to
tar+gz.
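
To make the tradeoff concrete, here is a rough back-of-the-envelope
model, not a measurement.  It assumes that a revlib revision hard-links
unchanged files to the previous revision, so it only pays for changed
file contents plus a per-file directory-entry cost, while a cacherev
stores the whole tree as tar+gz.  The function names, the 32-byte
directory-entry overhead and the gzip ratio of 4 are illustrative
assumptions, and tar header overhead is ignored.

```python
def revlib_bytes_per_revision(n_files, avg_file_size, changed_fraction,
                              dir_entry_overhead=32):
    """Approximate bytes one extra revlib revision adds.

    Assumption: unchanged files are hard links (no new data blocks),
    but every file still needs a directory entry in the new tree.
    """
    changed_files = n_files * changed_fraction
    return changed_files * avg_file_size + n_files * dir_entry_overhead


def cacherev_bytes_per_revision(n_files, avg_file_size, gzip_ratio=4.0):
    """Approximate bytes one cached revision (full tree, tar+gz) adds.

    Assumption: the whole tree compresses by a constant factor.
    """
    return n_files * avg_file_size / gzip_ratio


if __name__ == "__main__":
    # 1000 files of ~10 kB each, 1% of them touched per revision.
    print(revlib_bytes_per_revision(1000, 10_000, 0.01))
    print(cacherev_bytes_per_revision(1000, 10_000))
    # 1000 empty files: only the per-file overhead remains.
    print(revlib_bytes_per_revision(1000, 0, 1.0))
    print(cacherev_bytes_per_revision(1000, 0))
```

Under these assumed numbers the first tree favours the revlib, whereas
the tree of 1000 empty files favours the cacherev, which is consistent
with both positions in this thread.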

> If anyone still thinks that the current arch revlib is good to be defined
> by default for everyone, he may repeat this experiment.

I personally don't think it should be a default, precisely because it is
not always a good thing to have.

> Start with empty greedy revlib (sparse or not). Touch 1000 empty files
> and import them to a local archive. Then repeat this in a loop:

I agree: there are situations where revlibs are _not_ space-efficient.

My point is that there are also not-so-uncommon situations where revlibs
are quite space-efficient.  Not making it a default allows people to
decide for themselves whether it's a suitable approach for their
projects.  Of course, it would help to document the exact tradeoffs and
situations where revlibs may or may not be efficient so that users can
make the best choice for them.

IMO, complementing revlibs with per-file gzip would yield a
storage-efficiency improvement for a _large number_ of (not _all_)
projects.  OTOH, it would be more CPU-intensive and it disallows
`build-config --link' and the like, so it's certainly not advisable for
everyone.  Yet another tradeoff.
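
A minimal sketch of why the two approaches conflict, assuming a
filesystem that supports hard links.  The directory and file names are
made up for illustration and this is not how tla itself is implemented:
a hard-linked file shares its inode with the library copy, whereas a
per-file gzipped copy is smaller but holds different bytes, so a working
tree cannot simply link to it.

```python
import gzip
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as root:
        # Two hypothetical revision trees inside a revision library.
        rev1 = os.path.join(root, "rev-1")
        rev2 = os.path.join(root, "rev-2")
        os.mkdir(rev1)
        os.mkdir(rev2)

        data = b"int main(void) { return 0; }\n" * 200
        original = os.path.join(rev1, "hello.c")
        with open(original, "wb") as f:
            f.write(data)

        # Unchanged file in the next revision: a hard link, no new data.
        linked = os.path.join(rev2, "hello.c")
        os.link(original, linked)
        shares_inode = os.stat(original).st_ino == os.stat(linked).st_ino

        # Per-file gzip variant: less space, but the stored bytes are
        # no longer the file a build would want to link to.
        packed = gzip.compress(data)
        return {
            "shares_inode": shares_inode,
            "raw_bytes": len(data),
            "gzip_bytes": len(packed),
            "usable_as_is": packed == data,
        }


if __name__ == "__main__":
    print(demo())
```

The gzipped copy has to be decompressed (CPU cost) into a fresh file
before use, which is where the linking shortcut is lost.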

Thanks,
Ludovic.