Re: [Monotone-devel] README library list and log command fixes

From:	graydon hoare
Subject:	Re: [Monotone-devel] README library list and log command fixes
Date:	20 Oct 2003 00:20:20 -0400
User-agent:	Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Nathaniel Smith <address@hidden> writes:

> In general, we can't rely on fetching from www.off.net to fill in the
> blanks; he could just as well have promoted his version from a private
> repository or something.

I agree. it is a blunt goal, with depots, to make it possible to track
a development effort, as seen by some developer, strictly by watching
*their* depot. not the whole group's depots.

> Though, question: what does happen with the solution below if I have
> 
>  <stuff> -> A1 --> B1 -> B2
>               \         /
>                `-> A2 -'
> 
> and I commit B2 to my depot "B"?  we do a cert against B1, obviously,
> and a cert against A2, obviously... but do we then follow A2's
> ancestry up until we reach another revision in B?

nope. at least, not now. I don't think we should. do you?

currently, the least ancestor P of B2 which we find listed in B, we
use as the base for the delta we post. P -> B2 is posted both as a new
delta and as a new ancestry cert, generated by us. we also post
(unmodified) copies of all certs found in our database on B2.

the change I'm considering, to address your previous concern, is to
post all the ancestry certs (and possibly all the deltas) between P
and B2. since the aggregation of edges can also be seen as a
"feature", I'll probably put this under the control of a hook. I am
not suggesting chasing any other incoming edges to B2. if B1 is in B,
then P == B1 and that's as far back as we'll go.
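a minimal sketch of the two posting strategies described above, assuming a plain dict from each revision to its list of parents and a set of revisions already in the depot. the function names and the data layout are hypothetical, chosen for illustration; this is not monotone's actual code.

```python
from collections import deque

def nearest_posted_ancestor(parents, depot, rev):
    """Walk upward from rev, breadth-first, and return (P, path), where P is
    the first ancestor found in the depot and path lists the revisions
    from P down to rev. Returns (None, []) if no ancestor is posted."""
    queue = deque([(rev, [rev])])
    seen = {rev}
    while queue:
        node, path = queue.popleft()
        if node != rev and node in depot:
            return node, path[::-1]
        for p in parents.get(node, []):
            if p not in seen:
                seen.add(p)
                queue.append((p, path + [p]))
    return None, []

def edges_to_post(parents, depot, rev, strategy="aggregate"):
    """'aggregate' posts the single fused edge P -> rev; anything else
    posts every edge on the path found between P and rev."""
    base, path = nearest_posted_ancestor(parents, depot, rev)
    if base is None:
        return []
    if strategy == "aggregate":
        return [(base, rev)]
    return list(zip(path, path[1:]))

# the graph from the quoted question: B2 has parents B1 and A2.
parents = {"A1": [], "B1": ["A1"], "A2": ["A1"], "B2": ["B1", "A2"]}
depot_b = {"A1", "B1"}

# B1 is in the depot, so P == B1 and the walk goes no further.
print(edges_to_post(parents, depot_b, "B2"))  # [('B1', 'B2')]
```

if B1 were missing from the depot, the same walk would continue up to A1, and the two strategies would differ: one fused edge A1 -> B2, or each edge along the path it found.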

> Just to check -- the LCA algorithm knows to ignore these redundant
> edges, right?  I don't have time to come up with an example now, but
> it seems likely they could wreak some havoc with the way they shorten
> paths.  Or can that not happen?

sharp eyes again. here is the concrete case:

alice and bob fork at A, producing B and C:

   A --- B  (alice)
    \
     `-- C  (bob)

then bob fetches alice's edge B and updates, placing his own changes C
at the end of A -> B:

   A --- B       (alice)
    \
     `-- B -- C  (bob)

when bob posts, he posts the edge A -> C, since his depot doesn't have
B. now C has 2 parents, so far as someone watching bob knows:

   A --- B         (alice)
    \
     \--------\
      \        \
       `-- B -- C  (bob)

so LCA(C, <foo>) will see C having 2 parents and recursively skip up
to A, ignoring B. to see this in action, suppose alice makes another
version D:

   A --- B --- D       (alice)
    \
     \--------\
      \        \
       `-- B -- C      (bob)

and bob fetches and updates, pulling it in:

   A --- B --- D            (alice)
    \
     \--------\
      \        \
       `-- B -- C           (bob)
            \
             `-- D

now when bob runs "monotone merge", LCA(C,D) will find A, not B, due
to the recursive behavior I added to handle criss-cross merges. one
thing leads to another!
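the effect can be reproduced with a toy model. the rule below is an assumption made for illustration: a revision with one parent steps up to that parent, and a revision with several parents skips up to the common ancestor of those parents. monotone's real LCA code may differ in detail; the names here are hypothetical.

```python
def chain(parents, node):
    """Return [node, skip(node), skip(skip(node)), ...] up to a root.
    A multi-parent node skips to the toy LCA of its parents."""
    out = [node]
    while parents.get(out[-1]):
        ps = parents[out[-1]]
        nxt = ps[0]
        for p in ps[1:]:
            nxt = toy_lca(parents, nxt, p)
        out.append(nxt)
    return out

def toy_lca(parents, x, y):
    """First revision on y's skip-chain that also lies on x's skip-chain."""
    on_x = set(chain(parents, x))
    for n in chain(parents, y):
        if n in on_x:
            return n
    return None

# bob's local history without the aggregated edge: A -> B -> C, B -> D.
plain = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

# the same history after bob has fetched his own posted edge A -> C.
with_shortcut = {"A": [], "B": ["A"], "C": ["B", "A"], "D": ["B"]}

print(toy_lca(plain, "C", "D"))          # B
print(toy_lca(with_shortcut, "C", "D"))  # A
```

in the second graph C has two parents, so its chain skips straight from C to A, and B never appears as a candidate.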

how can this be resolved?

   1. bob can avoid fetching from his own depot. this will make it
      less likely (but not impossible) that he ever encounters the A
      -> C edge, or at least less likely that he encounters it before
      he has to merge C and D.

   2. bob can ignore the situation since A is probably not a bad LCA
      anyways, and monotone's merger will *likely* resolve the
      repeated part of the merge3(A,D,C) gracefully (the A->B change
      appearing twice).

   3. as you hint, we can try to teach the LCA algorithm to "ignore
      these redundant edges". either by making the A -> C edge somehow
      specially marked, or adding a secondary cert telling LCA to ignore
      it, or by committing a local (non-posted) inhibitor, concurrently
      with generating the A -> C edge for posting.

   4. as you are more broadly suggesting all through this email: never
      synthesize the A -> C ancestry cert, at all. repost what you
      have and forget about aggregation.
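option 3 can be sketched as a filter applied before any LCA computation. this assumes each edge carries a boolean "synthesized" marker, which is a hypothetical representation; the list above leaves open whether the mark would live on the edge, in a secondary cert, or in a local inhibitor.

```python
def parent_map(edges, keep_synthesized):
    """Build {child: [parents]} from (parent, child, synthesized) triples,
    optionally dropping the edges marked as synthesized."""
    parents = {}
    for parent, child, synthesized in edges:
        parents.setdefault(parent, [])
        parents.setdefault(child, [])
        if keep_synthesized or not synthesized:
            parents[child].append(parent)
    return parents

def ancestors(parents, rev):
    """All revisions reachable by walking parent edges up from rev."""
    seen, stack = set(), [rev]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# bob's database after fetching his own depot: A -> C is the fused edge.
edges = [("A", "B", False), ("B", "C", False), ("A", "C", True)]

full = parent_map(edges, keep_synthesized=True)
filtered = parent_map(edges, keep_synthesized=False)

print(full["C"])      # ['B', 'A']
print(filtered["C"])  # ['B']
print(ancestors(filtered, "C") == ancestors(full, "C"))  # True
```

dropping the marked edge loses no ancestry here because bob holds the real edges A -> B and B -> C locally. someone who sees only bob's depot does not hold B, so the same filter would cut C off from A for them.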

> Should we be worried about the transitive trust issues here?  If I
> automatically and uncritically create all these certs based on other
> people's ancestry certs, is that bad? 

well.. perhaps a little. ultimately if someone is sending me code, and
I'm incorporating it because I trust them, they have a way into the
code I share with my friends anyways, whether done with certs, or
simply code of theirs which I merged. but if synthesizing certs based
on certs creeps people out, we should turn it off by default.

> A possible solution to all these problems would be to, instead of
> trying to come up with clever ways to re-cert things, simply pass
> around ancestry certs promiscuously; upload full ancestry cert graphs
> and accept that the actual contents of any given revision may not be
> available. 

yeah. we can call this depot strategy "reposting", contrast with the
current strategy which is "aggregation". reposting is certainly one
way to solve it. the aggregation idea seems to introduce enough
possible failures and spooky actions that it might not be worth
supporting at all (or only behind a hook which, by default, returns
"repost").

> This solution also raises trust issues, because without any
> re-certing, you have to trust each and every key or parts of the
> ancestry graph become unavailable; this is the other advantage of the
> current re-certing mechanism, that you can get a coherent view of
> history by looking at only a single depot _and only trusting people
> with commit access to that depot_.

I wouldn't think of it in terms of "commit access". depots happen to
have a set of keys which are permitted to post to them, but that's
just to prevent abuse by random strangers. I (address@hidden) can
still post copies of ancestry certs signed by you (address@hidden) to
my depot, even though you have no "commit access" to my depot.

there are 2 problems which occur in the "reposting" strategy:

  - someone tracking my depot needs the public keys of everyone whose
    changes I am absorbing, or else parts of the graph vanish.

  - I post more bytes, overall, since I'm not fusing edges.

since neither is a major problem, and more importantly neither is a
*subtle* problem, I think that makes "reposting" desirable. I should
probably also point out, in case these cases sound too painful to
bear, that the "monotone propagate" command can fake something quite
similar to the aggregation strategy, using different branch names to
separate the aggregate stream from the reposted. that was its intent
anyways. 
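the first reposting problem listed above (missing keys making parts of the graph vanish) can be illustrated with a short sketch. it assumes ancestry certs are (parent, child, signer) triples and that a cert whose signer's key is unknown is simply ignored; the names and the layout are hypothetical.

```python
def visible_ancestors(certs, known_keys, rev):
    """Ancestors of rev reachable using only certs signed by known keys."""
    parents = {}
    for parent, child, signer in certs:
        if signer in known_keys:
            parents.setdefault(child, []).append(parent)
    seen, stack = set(), [rev]
    while stack:
        for p in parents.get(stack.pop(), []):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# bob reposts alice's cert for A -> B unmodified, next to his own for B -> C.
certs = [("A", "B", "alice"), ("B", "C", "bob")]

print(sorted(visible_ancestors(certs, {"alice", "bob"}, "C")))  # ['A', 'B']
print(sorted(visible_ancestors(certs, {"bob"}, "C")))           # ['B']
```

with only bob's key, the watcher sees B as C's parent but cannot connect B to A. the failure is visible as a gap in the graph, which is what makes it an obvious problem and not a subtle one.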

> brainstormingly yrs,

keep it up,

-graydon
