monotone-devel

Re: [Monotone-devel] newcomer (rude, but hopefully not to rude) question


From: graydon hoare
Subject: Re: [Monotone-devel] newcomer (rude, but hopefully not to rude) questions
Date: 12 Sep 2003 18:50:53 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Tom Lord <address@hidden> writes:

> Isn't it a consequence of that that logically equivalent files have
> no stable identity across revisions?

I suppose this depends on what you mean by logically equivalent, and
somewhat also depends on why you care. the only thing which really
cares about this stuff is the code which devises minimal patch sets to
transmit and displays a summary of work to the user (and I guess
"annotate", when we get around to that). but it's not *critical* that
this code always gets this right, because you can always correct it or
re-send something different, or just live with a move degrading to a
delete+add pair. but anyways, the current rules are:

  - if the files have the same name but different SHA1, it's assumed
    to be an edit

  - if the files have different names but the same SHA1:
    - if it's the first such pair in the tree, it's assumed to be a
      rename
    - if it's the second-or-later, it's assumed to be an addition

  - if the files have different names and a different SHA1, it assumes
    it's a delete+add pair. this will shortly be refined to accept
    an explicit certificate which associates the old and new version
    and path pairs within those two manifests, which will give it a
    better chance of noticing edit+rename operations correctly.
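
A rough sketch of the current rules listed above, in Python. This is not
monotone's actual code: the manifest representation (a dict of path to
SHA1), the function name `classify_changes`, and the use of sorted path
order to decide which pair counts as "first" are all assumptions made for
illustration.

```python
def classify_changes(old, new):
    """Classify differences between two manifests.

    old, new: dict mapping relative path -> SHA1 hex string (assumed layout).
    Returns a list of (kind, old_path, new_path) tuples.
    """
    changes = []

    # Same name, different SHA1: assumed to be an edit.
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            changes.append(("edit", path, path))

    removed = {p: h for p, h in old.items() if p not in new}
    added = sorted(p for p in new if p not in old)

    # Index the vanished paths by content hash so renames can be matched.
    by_hash = {}
    for p in sorted(removed):
        by_hash.setdefault(removed[p], []).append(p)

    renamed_from = set()
    for p in added:
        candidates = by_hash.get(new[p], [])
        src = next((s for s in candidates if s not in renamed_from), None)
        if src is not None:
            # Different name, same SHA1, first such pair: a rename.
            renamed_from.add(src)
            changes.append(("rename", src, p))
        else:
            # Second-or-later match, or no match at all: an addition.
            changes.append(("add", None, p))

    # Old paths that were neither kept nor renamed are deletes. A file that
    # was both moved and edited therefore shows up as delete + add.
    for p in sorted(removed):
        if p not in renamed_from:
            changes.append(("delete", p, None))

    return changes
```

With an old manifest containing `b.c` and a new one containing `x/b.c` and
`y/b.c` with the same SHA1, this reports one rename and one addition, which
is the rename-plus-copy case asked about further down.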

> And isn't it also a consequence of that that logically distinct files
> may have no unique identity aside from relative path in revisions
> where the relative path has not changed?

correct. identity of files is imo pretty transient. it's contents
which interest me, not which inode the file was in. I don't track any
sort of unique identifier for "a file over its lifetime".

> I might rename a file to one location and copy it to another in a
> single transaction, creating two files with equal SHA1 signatures
> but distinct relpaths, neither relpath being identical to the
> preceding revision.  Is it then possible to tell which is then the
> descendent of the original and which a new file?

perhaps. if you changed the SHA1 in the same instance as you moved the
file, then as I said it'll need an explicit cert to identify the
file. that feature is imo minor, but it's forthcoming, because it's a
regular enough thing to want and I see no harm in doing it.

you should note, however, that file ancestry information -- history in
general -- is never assured to be *complete* in monotone, just
self-identifying. you can have 'less than all' of history, and things
will carry on working, so long as you have full head versions. deltas
are stored backwards, and intermediate versions can be aggregated or
deleted. at some point the history of any file or tree trails off into
the ether, and it might not be at the Very First Revision Ever Made
Anywhere.

> As consequences of this, I don't see how inexact patching or
> out-of-band patching can work except in a limited subset of cases.  By
> inexact patching, I mean the application of a changeset between two
> arbitrary revisions to a third tree.   By out-of-band patching, I mean
> the exchange of changesets by means other than through the revision
> control system.

hmm. well, I don't see how "out of band" has anything to do with it;
you can always calculate and send any delta between any versions via
any medium you like, it's just a blob of ascii and carries no
obligations. if the recipient has the pre-version they can accept it
and use it, if not they might ask you for a complete version rather
than a delta. deltas are between blocks of data -- identified by SHA1
-- not "files" with external identity.
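
To make the "deltas are between blocks of data" point concrete, here is a
minimal Python sketch of a delta that names its pre- and post-versions by
SHA1 only. It is an illustration, not monotone's packet format: the dict
layout and the names `make_delta` and `apply_delta` are hypothetical, and
the line-based diff from `difflib` stands in for whatever delta algorithm
is really used.

```python
import hashlib
from difflib import SequenceMatcher


def sha1_of(text):
    """SHA1 of the text's UTF-8 bytes, as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def make_delta(old, new):
    """Build a delta from old to new, identified only by content hashes."""
    a = old.splitlines(keepends=True)
    b = new.splitlines(keepends=True)
    ops = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))       # reuse lines from the base
        elif j2 > j1:
            ops.append(("insert", b[j1:j2]))   # carry the new lines
    return {"pre": sha1_of(old), "post": sha1_of(new), "ops": ops}


def apply_delta(base, delta):
    """Apply a delta if, and only if, the recipient holds the pre-version."""
    if sha1_of(base) != delta["pre"]:
        raise ValueError("base does not match the delta's pre-version")
    a = base.splitlines(keepends=True)
    out = []
    for op in delta["ops"]:
        if op[0] == "copy":
            out.extend(a[op[1]:op[2]])
        else:
            out.extend(op[1])
    result = "".join(out)
    if sha1_of(result) != delta["post"]:
        raise ValueError("result does not match the delta's post-version")
    return result
```

No path or file identity appears anywhere in the delta; a recipient whose
data does not hash to `pre` gets an error and can ask for a complete
version instead.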

as for applying a changeset between arbitrary revisions to a third
tree: you can *try* applying it, but of course it might not
apply. that is, I think, true of any VC system: if the accepting
context doesn't fit the prerequisites of the delta, you cannot apply
it without manual intervention. I mean, I can't take a patch between
two versions of linux and apply it to freebsd. it won't land.

when monotone merges two versions B and C, it tries to find an
ancestor A, to adjust the A->B edge to fit the context of the A->C
edge. in theory I can add another command called "cherrypick" which
applies a smaller edge (say D->B where D is a descendant of A) to the
A->C context. that would have the best chance of making the patch land
properly, that I can think of. do you have a better approach?

> Finally, the cryptographic features of monotone are presumably
> oriented towards leveraging the human value of trust to achieve some
> practical effect.  Is there a nice concise summary somewhere of the
> effect that is purportedly achieved and the proofs that the
> cryptographic techniques employed achieve that effect?

there is not a concise summary at the moment, and certainly no proof!

there are two reasons for the cryptography, though. the first is
obvious: you don't need to trust anyone during communication because
you can evaluate the trustworthiness of what was said after the fact,
and re-evaluate it if your opinion of the speaker changes. so there's
no communication protocol, nor any expression of trust during
communication. you speak and listen promiscuously, taking packets from
any source that can provide them, but act conservatively based on
evaluation of signatures.

the second reason, more subtle, is to enable "monotonic update". when
you get packets from some other monotone user, you incorporate them
into your database, but you do not necessarily use them. when you run
the update command, it looks over your existing working copy and all
the available update candidates, and sorts them by whatever criteria
(expressed as cryptographic certificates) you like. it will only
update your working copy to something which ranks as "as good or
better" during the sorting phase.
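
A hedged Python sketch of that "monotonic update" selection. Everything
here is assumed for illustration: certificates are modelled as plain
(version, name, value, signer) tuples with signature checking left out,
the ranking is simply a count of satisfied criteria, and `pick_update` is
a hypothetical name, not a monotone command or API.

```python
def score(version, certs, trusted_signers, criteria):
    """Count how many criteria a version meets according to trusted signers.

    certs: iterable of (version, name, value, signer) tuples. In a real
    system each cert's signature would be verified first; skipped here.
    criteria: list of (name, value) pairs the user cares about.
    """
    satisfied = {
        (name, value)
        for (v, name, value, signer) in certs
        if v == version and signer in trusted_signers
    }
    return sum(1 for c in criteria if c in satisfied)


def pick_update(current, candidates, certs, trusted_signers, criteria):
    """Return the version to update to, never ranking below the current one."""
    def rank(v):
        return score(v, certs, trusted_signers, criteria)

    baseline = rank(current)
    acceptable = [v for v in candidates if rank(v) >= baseline]
    if not acceptable:
        return current          # nothing is "as good or better": stay put
    return max(acceptable, key=rank)
```

Certs from signers the recipient does not trust contribute nothing to the
score, so receiving them costs nothing beyond storage, and an author
cannot raise the rank of their own version by self-certifying unless the
recipient has chosen to trust them.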

so if, for example, automated test results, or code review, or
security warnings, or build results, or 3rd party usability
evaluations are expressed as certificates (which you can receive
out-of-band from anywhere), people will ignore versions which don't
measure up, and there's nothing the author of a version can do to
prevent it. it's intended to permit the recipients of versions
(customers, QA departments, etc) to ignore crappy versions, and to let
the recipients decide whose judgement to trust on what makes a version
crappy.

> Combatively, if interpreted pessimally, but collaboratively if not,

oh, I hope we will get along. I suspect we have similar goals, so a
little friendly competition should be good. if arch winds up absorbing
all the right ideas, I'll just use arch. I just want the problem
solved.

-graydon




