Re: [Monotone-devel] Re: user-friendly hash formats, redux


From: Oren Ben-Kiki
Subject: Re: [Monotone-devel] Re: user-friendly hash formats, redux
Date: Mon, 6 Dec 2004 00:12:00 +0200
User-agent: KMail/1.7.1

On Sunday 05 December 2004 21:39, graydon hoare wrote:
>    - another local sequential system might involve keeping a sequence
>      number for each author, sorted by date, such that the numbers go
>
>        "derek-10", "graydon-12", "matt-72", "joel-13", "derek-11"

Neat, but I don't see that it solves the core problem. First, I still 
don't know what the revision before gordon-206 is. Is it gordon-205 or 
matt-71?

>      I posted a couple days ago a proposal -- quite honestly -- for
>      assigning CVS-style "local" x.y.z sequence numbers, which is
>      relatively easy to do now that we have a fixed DAG-shaped
>      revision graph.

(An aside: I never could stand CVS branch naming. I always wondered why 
they don't use letters for branch ids: I find 3a.12 to be "clearly" 
revision 12 of branch 'a' of revision 3. In contrast I need to work on 
parsing 3.1.12. And compare the readability of 3d.4b.11 and 
3.4.4.2.11... is it just me?).

A problem with x.y.z (if I understand it correctly): My db has 1 --> 2 
--> 3. Yours has 1 --> 2 --> 3'. We sync. Now, in my db, 3' is 2.1.1, 
while in yours, 3 is 2.1.1. When we talk to each other, we can't use the 
numbers "3" and "2.1.1" reliably, because each one names a different 
revision in each db.
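To make the disagreement concrete, here is a rough Python sketch. It 
assumes a numbering rule that is only my guess at the x.y.z proposal: 
the first child a db sees continues its parent's line, and any later 
child of the same parent opens a numbered branch. The function name and 
the placeholder revision names (h1, h2, h3, h3') are made up for 
illustration.

```python
def assign_local_ids(arrivals):
    """Number revisions in the order this db first saw them.

    arrivals: list of (revision, parent) pairs, parents before children.
    The numbering rule is an assumption, not Monotone's actual behaviour.
    """
    ids = {}
    children_seen = {}  # parent revision -> children numbered so far
    for rev, parent in arrivals:
        if parent is None:
            ids[rev] = "1"
            continue
        n = children_seen.get(parent, 0)
        children_seen[parent] = n + 1
        if n == 0:
            # First child seen locally continues the parent's line.
            head, _, last = ids[parent].rpartition(".")
            prefix = head + "." if head else ""
            ids[rev] = prefix + str(int(last) + 1)
        else:
            # Later children open branch number n under the parent.
            ids[rev] = f"{ids[parent]}.{n}.1"
    return ids


# My db saw h3 first; yours saw h3' first. After the sync both dbs hold
# the same four revisions, but the labels are swapped.
mine = assign_local_ids(
    [("h1", None), ("h2", "h1"), ("h3", "h2"), ("h3'", "h2")])
yours = assign_local_ids(
    [("h1", None), ("h2", "h1"), ("h3'", "h2"), ("h3", "h2")])
```

Here mine["h3'"] and yours["h3"] both come out as "2.1.1", which is 
exactly the ambiguity described above.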

There seems to be an inherent trade-off. What is more important:

- Stable local ids, or

- Consistent ids across dbs after a sync?

The x.y.z system you suggested is stable; doing a sync doesn't change 
any id. Here's a different system that has consistent ids after a sync: 

Since most of the time it's different people that cause a fork, use the 
author's name. That is,

   1 --> 2 --> 2.mine.1 --> 3 (merge kills forked id)
           \-> 2.your.1 -/

It does require something extra if the same author causes the fork:

   1 --> 2.mine/a.1 --> 3
     \-> 2.mine/b.1 -/

So, after a sync, you get consistent revision ids; you can safely talk 
about 'x.mine.y' and 'x.your.y'. However, this requires potentially 
massive renumbering:

   1 --> 2 --> 3 --> 4 -> ... (mine)

   1 --> 2' --> 3' (your)

   1 --> 2.mine.1 --> 2.mine.2 --> ... (after sync)
     \-> 2.your.1 -> 2.your.2

And if you never want to merge 2.your.* into the head, you need to 
somehow "kill" that fork, after which numbers are, again, massively 
changed.

Nasty trade-off!

You can avoid it, with a price: have a strong notion of a branch's "main 
line" (or "trunk").

If I understand correctly, Monotone today doesn't have this notion; all 
heads are equal, and you need to merge before you have a "latest 
revision". But suppose that a certain path of revisions was "blessed" 
somehow (the "main line"). It would always be numbered 1 - 2 - 3... 
You'd need something like the branch's key to mark a revision as "main 
line".

All other revisions are "forks", so even if the main line is 1 - 2 and I 
add a new revision based on "2", it will be "2.mine.1" unless I sign a 
cert with the right key saying it is "main line", in which case it will 
be "3".

This way, revision ids would be stable _and_ consistent after a sync; it 
is as close to a globally unique id as you can get.
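A minimal sketch of this numbering in Python, under several 
assumptions of mine: each revision has a single parent (merges are left 
out), the "blessed" flag stands in for a main-line cert signed with the 
branch's key, a fork keeps the label of the author who started it, and 
the same author never forks twice from one revision. The function name 
and the tuple layout are hypothetical.

```python
def number_revisions(revs):
    """Assign <n> or <x>.<author>.<y> ids.

    revs: iterable of (rev, parent, author, blessed) tuples, with every
    parent listed before its children. The root must be blessed.
    """
    ids = {}
    main_count = 0
    for rev, parent, author, blessed in revs:
        if blessed:
            # Main-line revisions are simply 1, 2, 3, ...
            main_count += 1
            ids[rev] = str(main_count)
        elif ids[parent].isdigit():
            # First revision of a fork off main-line revision x.
            ids[rev] = f"{ids[parent]}.{author}.1"
        else:
            # Continue an existing fork: x.author.y -> x.author.(y+1).
            base, seq = ids[parent].rsplit(".", 1)
            ids[rev] = f"{base}.{int(seq) + 1}"
    return ids


history = [
    ("h1", None, "mine", True),
    ("h2", "h1", "mine", True),
    ("h3", "h2", "mine", True),
    ("h3'", "h2", "your", False),
    ("h4'", "h3'", "your", False),
]
# The same revisions, in the order the other db would have seen them.
other_order = [history[0], history[1], history[3], history[4], history[2]]
```

Both orderings give the same ids, because an id depends only on the 
parent's id, the author and the blessing, not on which db saw the 
revision first.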

True, it would not _completely_ eliminate re-numbering, but it would 
only renumber revisions when you force it to in truly pathological 
ways: You'd need to create two different forks in two different dbs, 
starting from the same revision, and then sync the dbs. A variant of 
this is signing both of the conflicting forks as being "main line"; 
the system will have to (arbitrarily?) decide which of them is a fork. 
Spanking the author would also be a good idea at this point :-)

An advantage of this is you can always get "the" latest revision of a 
branch, the "main line" one, even if there are several heads (all the 
others are "forks").

It also makes the <author>-<seq> idea redundant - or, seen another way, 
it subsumes it, since you get <x>.<author>.<y>. In fact it is even 
better, because <y> is sequential, so you know which revision comes 
before/after which.
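As a small illustration of why the sequential <y> helps, here is a 
Python sketch that computes the id that comes before a given one. It 
assumes ids are either a bare main-line number or a single-level 
<x>.<author>.<y> (no forks of forks, no same-author "/a" suffixes), and 
that author names contain no dots. Both function names are made up.

```python
def parse_rev(text):
    """Split an id into (x, author, y); author is None on the main line."""
    parts = text.split(".")
    if len(parts) == 1:
        return (int(parts[0]), None, 0)
    x, author, y = parts
    return (int(x), author, int(y))


def predecessor(text):
    """Return the id of the revision just before this one on its line."""
    x, author, y = parse_rev(text)
    if author is None:
        # Main line: the previous main-line revision, if there is one.
        return str(x - 1) if x > 1 else None
    if y > 1:
        # Inside a fork: step back along the fork.
        return f"{x}.{author}.{y - 1}"
    # First revision of a fork: its parent is main-line revision x.
    return str(x)
```

With this, the revision before 4.gordon.2 is 4.gordon.1, and the one 
before 4.gordon.1 is 4; the id alone answers the question.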

It does make "forks" something more "solid" than today - they obtain 
some of the functionality of a full "branch". It's the 
intent-vs.-mechanism issue again. A "fork" is intended to be merged 
"ASAP", a branch is intended to create an alternative release and may 
never be merged.

Hmmm. Not the most elegant system in the world, but it seems like it 
should work...

> I'm willing to completely toss out the selector stuff we have now if
> nobody's using it. it's an experiment.

You have created a small query language for revision ids. Being able to 
pick a revision based on a general query is a good thing. Creating a 
private query language has a lot of disadvantages, though. Perhaps you 
should allow the use of an SQL query instead?

> ... here is a concrete proposal: what would happen if the
> command line accepted revisions in any of 3 forms:
>
>     --hash   or -h  <id>                global hash identifier
>     --seq    or -s  <author>-<seq>      local sequence numbers
>     --rev    or -r  <x>.<y>.<z>...      local revision numbers

Get rid of --seq; make --rev use the <x>.<author>.<y> notation, and add:

      --sql "SQL query"                   full-powered query
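To show the sort of thing I mean, here is a sketch using Python's 
sqlite3 module against an in-memory database. The "revisions" table, 
its columns, the hash values and the helper name are all invented for 
the example; this is not Monotone's real schema, and a real --sql option 
would need to restrict the query to read-only statements.

```python
import sqlite3


def select_revisions(conn, query):
    """Run a user-supplied SELECT; return the first column of each row."""
    return [row[0] for row in conn.execute(query)]


# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revisions (hash TEXT, author TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO revisions VALUES (?, ?, ?)",
    [
        ("aaa111", "gordon", "2004-12-01"),
        ("bbb222", "matt", "2004-12-03"),
        ("ccc333", "gordon", "2004-12-05"),
    ],
)

# What a user might pass as the --sql argument.
hits = select_revisions(
    conn,
    "SELECT hash FROM revisions WHERE author = 'gordon' ORDER BY date",
)
```

The point is only that the selection logic lives in a language people 
already know, rather than in a private selector syntax.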

> and we ask a hook for your preferences as far as which to print out
> (possibly all three) when listing logs, status, etc.

Using only --hash and --rev, it makes sense to always print both forms:

    4.gordon.2 = a47c12d....

Have fun,

 Oren Ben-Kiki



