From: Nathaniel Smith
Subject: Re: [Monotone-devel] CVS sync works (for me)
Date: Mon, 21 Feb 2005 10:36:40 -0800
User-agent: Mutt/1.5.6+20040907i

On Mon, Feb 21, 2005 at 03:08:53PM +0100, Christof Petig wrote:
> you see that for larger projects only a few of the files change per
> revision (edge). But such a cert can easily grow to 1817 lines and 31.5k
> bytes. If you multiply that with the amount of edges (3300 for this
> project) you get about 90MB! So I decided to store only the changing
> files like:
>
> cvs.midgard.berlios.de:/cvsroot/midgard/midgard (repository)
> +ebf337072571135affe49b5da42b7342ddba0852 (last revision)
> - dir/deleted
> 1.5 dir/changed
> 1.1 dir/added
Okay, I think this is on the same track as what I meant. What I was
trying to point out was that if you know just one change that happened
in a single commit -- like, dir/changed went from 1.4 to 1.5 -- then
that should be enough to identify that entire commit. (After all,
dir/changed only goes 1.4 -> 1.5 once, ever.) Basically, you look up
when that happened, note the time/changelog/etc., and then use
that to assemble the other changes since then.
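
To make the idea concrete, here is a rough sketch of recovering a whole
commit from one known file transition. Everything in it is an assumption
for illustration: the FileRev record, the commit_for helper, the sample
data, and the 300-second grouping window are all made up, and matching on
author + changelog + nearby timestamp is just the usual CVS heuristic,
not what the sync code actually does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRev:
    """One per-file CVS revision (hypothetical record, not a monotone type)."""
    path: str
    rev: str
    time: int    # commit time, seconds since the epoch
    author: str
    log: str     # changelog message


def commit_for(revs, path, rev, window=300):
    """Return every file revision that appears to share a commit with path@rev.

    The anchor (path, rev) is unique, so it pins down one time/author/log
    triple; other revisions with the same author and log within `window`
    seconds are assumed to belong to the same commit.
    """
    anchor = next(r for r in revs if r.path == path and r.rev == rev)
    return sorted(
        (r for r in revs
         if r.author == anchor.author
         and r.log == anchor.log
         and abs(r.time - anchor.time) <= window),
        key=lambda r: r.path,
    )


# Made-up history: dir/changed 1.5 and dir/added 1.1 were committed together.
SAMPLE = [
    FileRev("dir/changed", "1.4", 1000, "christof", "initial import"),
    FileRev("dir/changed", "1.5", 5000, "christof", "fix build"),
    FileRev("dir/added", "1.1", 5002, "christof", "fix build"),
    FileRev("dir/other", "1.2", 9000, "christof", "fix build"),
]

found = [(r.path, r.rev) for r in commit_for(SAMPLE, "dir/changed", "1.5")]
```

With the sample data, looking up dir/changed 1.5 also pulls in dir/added
1.1, while dir/other 1.2 is left out because it falls outside the window
despite having the same changelog.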
> This should get the size down to a reasonable amount and is readable
> enough to be able to verify by sight. [That's actually the reason I
> refrained from reverse diffing (store the last cert in full length and
> recode older ones as time-backwards-diffs)]. I have to read and process
> all the certs anyway.
Hmm, that doesn't sound very scalable. Why do you have to do that?
> The reason I need older certs as well is to enable the correct rooting
> of branches (once supported).
Right.
> PS: Ever thought about putting an index on revision_certs.id? Perhaps
> this speeds up correctness verification (just guessing) and since data
> retrieval is more likely than data modification I cannot see drawbacks.
> Similar might apply to other large (number of rows) tables as well.
> [e.g. manifest_deltas.id, file_deltas.id]
Hmm, might be a good idea -- do you have some test case where it
matters? Maybe "log" or something?
If sqlite is like other RDBMSs, I'd think that we already have
indices on {manifest,file}_deltas.id, because usually unique
constraints generate implicit indices. Could be wrong, though.
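
One way to check this is to ask sqlite directly. The sketch below uses
Python's sqlite3 module on an in-memory database. The two tables
(deltas_demo, certs_demo) and their columns are simplified stand-ins
invented for the test, not monotone's real schema, and index_names is
a hypothetical helper.

```python
import sqlite3


def index_names(conn, table):
    """List index names on `table`; column 1 of PRAGMA index_list is the name.

    The table name is interpolated directly, which is fine for this demo
    but should not be done with untrusted input.
    """
    return [row[1] for row in conn.execute(f"PRAGMA index_list({table})")]


conn = sqlite3.connect(":memory:")

# Stand-in for a deltas table: a table-level UNIQUE constraint whose
# leading column is id.
conn.execute(
    "CREATE TABLE deltas_demo ("
    " id TEXT NOT NULL, base TEXT NOT NULL, delta TEXT NOT NULL,"
    " UNIQUE (id, base))"
)

# Stand-in for a certs table: only hash is unique, id is not constrained.
conn.execute(
    "CREATE TABLE certs_demo ("
    " hash TEXT NOT NULL UNIQUE, id TEXT NOT NULL,"
    " name TEXT NOT NULL, value TEXT NOT NULL)"
)

deltas_indexes = index_names(conn, "deltas_demo")
certs_indexes_before = index_names(conn, "certs_demo")

# Add the explicit index suggested for lookups by id.
conn.execute("CREATE INDEX certs_demo_id ON certs_demo (id)")

# Ask the planner how it would run a lookup by id; the last column of
# each EXPLAIN QUERY PLAN row is the human-readable detail text.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM certs_demo WHERE id = ?", ("abc",)
).fetchall()
plan_detail = " ".join(str(row[-1]) for row in plan)
```

Here deltas_indexes comes back as ['sqlite_autoindex_deltas_demo_1'], so
the unique constraint does produce an implicit index, and since id is
its leading column it can serve lookups by id. In the stand-in certs
table the only automatic index is the one on hash, so a lookup by id has
nothing to use until the explicit index is created. Whether the real
tables look like this depends on the actual schema.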
-- Nathaniel
--
"On arrival in my ward I was immediately served with lunch. `This is
what you ordered yesterday.' I pointed out that I had just arrived,
only to be told: `This is what your bed ordered.'"
-- Letter to the Editor, The Times, September 2000