[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Gnu-arch-users] Re: situations where cached revisions are not so go
From: |
Jason McCarty |
Subject: |
Re: [Gnu-arch-users] Re: situations where cached revisions are not so good |
Date: |
Sun, 28 Sep 2003 01:56:41 -0400 |
User-agent: |
Mutt/1.5.4i |
Tom Lord wrote:
> > I don't make use of cachedrevs, but they could be
> > integrated if desired (for example if a revision is really huge, a
> > cachedrev might be better than summaries).
> >
> > Implementation of arch_build_revision(ALMOST, TARGET, DESTDIR):
> > 1) If ALMOST is nil:
> > 2) Try to find a pristine or libraryrev ALMOST which is the
> > greatest lower bound of TARGET, with the same version as TARGET.
> > Else:
> > 3) Set ALMOST to TARGET's base-0.
>
> I sort of follow you here. `versions' (in the arch sense of logical
> development lines) don't come into play directly. We have to talk
> about `ancestry'.
As a useful first approximation, my algorithm assumes that the entire
ancestry resides within one version. If this assumption is incorrect, it
will be changed in step 8, when a continuation from another version is
reached. I believe this will follow the ancestry correctly.
I should have explained my least upper bound (lub) and greatest lower
bound (glb) operations more completely, as I think much of your
criticism stems from not catching my meaning. Because of the assumption
that the whole ancestry is in the same version, the lub/glb calculations
become almost trivial; they are just making comparisons between integers
(although comparing the versionfix versions needs an adjustment).
So if GOAL is a--b--1.0--patch-39 and we have pristines for
a--b--1.0--patch-{13,17,21,52}, then glb(GOAL) = a--b--1.0--patch-21,
since 21 <= 39 and 21 >= 13 and 17. So a--b--1.0--patch-21 would be
ALMOST (even though it may not be correct, it will be fixed later).
Similarly, with the same ALMOST and GOAL, if there are summaries from
a--b--1.0--patch-{12,23,27} to GOAL (they would be in GOAL's directory)
then lub(ALMOST) = a--b--1.0--patch-23, since 23 >= 21 and 22 <= 27. So
DELTA would be summary-23-39, and ORIG would be a--b--1.0--patch-23.
Even if there is a continuation from another version between patch-23
and patch-39, it doesn't matter, because the summary is an exact delta
between them, and we have patch-23 in full. Granted, the summary may be
bloated if the continuation caused much change from the previous
revisions.
> We don't know, a priori, the entire ancestry of GOAL. To find it
> out, we have to walk it backwards, querying an archive each step of
> the way.
>
> So, we can't pick the closest ancester ALMOST without some server
> round-trips here -- but that's OK, because we'll need to make those
> very same round-trips no matter what.
Which is why I make an initial assumption and correct it as necessary
while walking the tree backwards.
> There is a slight problem with your approach. We can't count on
> access to the oldest ancestor. It may very well be in some archive
> that we don't have access to. We pretty much _have_ to be willing to
> set ALMOST to an archive-cached revision.
>
> Another problem is that the oldest ancestor may be very far away
> compared to the nearest archive-cached ancestor -- another reason we
> can not ignore archive cached revisions.
That's possible. I treated cachedrevs as an afterthought when I should
have implemented them fully. In step 6, if a cachedrev for GOAL existed
you could retrieve it and return. I think it might be advantageous to
have a flag in ~/.arch-params/=locations for each archive that stated
whether to utilize cachedrevs in arch_build_revision for that archive.
> (Oh, and, just as a point of amusing trivia: unpacking a _locally_
> archive-cached revision is notably faster than copying that revision
> from a revision library.)
I didn't know that. The current algorithm also uses revlibs in
preference to cachedrevs, doesn't it?
> > 4) If ALMOST == TARGET, copy ALMOST to DESTDIR and return (may have to
> > call arch_build_revision if it's a continuation, or connect to
> > archive to grab import).
>
> Sure, if ALMOST is the GOAL and we've managed to find a copy of ALMOST
> -- we're done.
>
>
> > 5) Connect to TARGET's archive.
> > 6) Find a DELTA to TARGET whose FROM endpoint is the least upper bound
> > of ALMOST (this will be either a revision or a summary delta in
> > TARGET's directory). Continuations count also; let FROM be the
> > revision it's a continuation of. Use a continuation from another
> > version if it's all that's there.
> > 7) If DELTA is a continuation from a different version than TARGET:
> > 8) Call arch_build_revision(0, FROM, DESTDIR).
> > Else call arch_build_revision(ALMOST, FROM, DESTDIR)
> > 9) Retrieve and apply DELTA to DESTDIR.
>
> Ok, assuming that I follow you: the oversimplification in what
> you're saying is that you can not ignore archive cached revisions.
> You can't count on searching way far back in ancestry. Searching
> back in ancestry requires one round-trip per step. If the summary
> delta you find is too far back relative to the next nearest cachedrev,
> this is a pessimizing algorithm. etc.
Part of the point of summaries (at least in my implementation) is that
you don't have to search the ancestry revision-by-revision. You skip
large parts of it, which _should_ actually result in fewer round-trips.
Knowledge of the complete ancestry is discovered as we go, and not all
of the ancestry need be examined.
So it's not any more pessimizing than cachedrevs currently are.
> It's a graph searching problem. We have the ancestry graph which is
> just a linear, directed graph. Now you're adding these summary-delta
> links to the ancestry graph so we have things like:
I utilize the ALMOST variable to force which path will be followed, so
we effectively still have a linear graph.
> GOAL --> GOAL-ancestor --> GOAL-ancestor^2 .... -> GOAL-ancestor^N ...
> \ ^
> \ /
> `---------------------------'
>
> Each node on the graph is a fully qualified revision name. For a
> given revision name, we can cheaply ask: "do I have a pristine?" and
> "do I have a rev lib entry?"
>
> For the cost of a server roundtrip, we can ask: "is it archive
> cached?" and "is there a summary delta here" and "what is the name of
> the next ancestor?"
>
> Initially, we know only "GOAL".
>
> We can build GOAL by walking backwards along this graph until we find
> a sufficient collection of bits and pieces to assemble GOAL. With
> summary-deltas, and a more complicated build_rev algorithm, now we
> have multiple paths we might want to explore.
More complicated algorithms could certainly put forth lots more effort
to follow the graph in the most efficient way, yes. I've chosen a simple
algorithm to avoid doing complicated graph searches.
> One difficulty of your algorithm, if I'm following it correctly, is
> these "least upper bound" computations -- they assume more knowledge
> of the ancestry graph than we have on hand.
I think I've shown above that they are trivial to calculate, and that
they eventually get corrected in step 8 if they're wrong.
> Your algorithm isn't _wrong_, by a long stretch -- it's just missing
> consideration of archive-cached revisions and missing a bunch of
> important details. (So, keep going? Maybe this is a good time at
> which to start playing with the code.)
I should give cachedrevs more consideration, although I don't think
there's a huge amount of detail to add than what I wrote in this
message. I _wish_ I had time to try implementing it, but I've spent far
too much time thinking about it already. Perhaps next weekend I'll be
able to attempt an implementation.
> > As you can see from my algorithm, the amount of extra code is not that
> > large.
>
> I don't see that yet. Might be true -- but I don't see it yet.
That was hand waving on my part, sorry. We'll both find out if/when I
write some code.
Jason
- [Gnu-arch-users] Re: situations where cached revisions are not so good, (continued)
- [Gnu-arch-users] Re: situations where cached revisions are not so good, Miles Bader, 2003/09/24
- [Gnu-arch-users] Re: situations where cached revisions are not so good, Stig Brautaset, 2003/09/24
- [Gnu-arch-users] Re: situations where cached revisions are not so good, Miles Bader, 2003/09/24
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Jason McCarty, 2003/09/24
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Jason McCarty, 2003/09/24
- [Gnu-arch-users] Re: situations where cached revisions are not so good, Miles Bader, 2003/09/26
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Jason McCarty, 2003/09/27
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Tom Lord, 2003/09/27
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Jason McCarty, 2003/09/27
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Tom Lord, 2003/09/27
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good,
Jason McCarty <=
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Robert Collins, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Tom Lord, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Andrew Suffield, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Tom Lord, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Robert Collins, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Tom Lord, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Tom Lord, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Jason McCarty, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Tom Lord, 2003/09/28
- Re: [Gnu-arch-users] Re: situations where cached revisions are not so good, Robert Collins, 2003/09/28