[Monotone-devel] [RFC] versioned policy -- bootstrapping

monotone-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] [RFC] versioned policy -- bootstrapping

From:	Nathaniel Smith
Subject:	[Monotone-devel] [RFC] versioned policy -- bootstrapping
Date:	Sun, 10 Sep 2006 02:46:10 -0700
User-agent:	Mutt/1.5.13 (2006-08-11)

(For archive readers: this should be read after
  http://article.gmane.org/gmane.comp.version-control.monotone.devel/8169
)

The policy branch framework has a chicken-and-egg problem; we want the
trust rules in effect at any given time to be something like "those
written in the head of the policy branch"; but it's those very trust
rules that are supposed to let us determine what counts as a head.

So we need some sort of special mechanism to get from a trust seed to
the current trust tree.  Furthermore, the details of this mechanism
matter a lot.  The wrong rules would make it possible for someone to
hijack a project, or make it impossible to fully revoke someone's
access -- neither is acceptable.  So there's some finicky technical
work to do here.  I'll warn up front that I don't have solutions to
all the problems.  But hopefully this will help bring everyone up to
speed so y'all can scratch your heads along with me :-).

The problem
-----------

The most important case to handle is an antagonistic fork.
Essentially, if a project has two admins with equal rights, it must
not be possible for either admin to unilaterally kick out the other
one, without recourse.  There's only one set of behavior I know of
that accomplish these apparently contradictory goals, so I guess it's
the desired one: one group of admins can kick out another group.  Once
this has happened, that other group cannot re-instate themselves.
But, what they _can_ do, is retaliate by dragging the project to a
halt, until individual users decide which side of the fork to follow.

Define the /candidate set/ to be the set of revisions descended from
the trust seedwhich are viable choices for the current trust tree.
Think of this as being like "heads", without committing up-front that
this is what they actually are.  Ideally this set has exactly one
member.

I propose a property, let's call it (*), that the candidate set should
satisfy: Once a revision R enters the candidate set, it cannot be
pushed out except by the arrival of some revision that _R_ believes
should replace it.

Define a revision to be /reachable/ from the trust seed if at some
point it was in the candidate set.

Another statement of (*) is: resetting your trust seed to some
reachable descendent, should only ever remove revisions from the
candidate set, never add them.

Okay, right, that needs some motivating examples.  Here's a straw-man
algorithm for determining the candidate set:
  a) set the current trust rules to match those defined in the trust
     seed
  b) find the head(s) of the policy branch under the current trust
     rules
  c) set the current trust rules to match those given in the current
     head(s)
  d) repeat steps (b) and (c) until the algorithm converges.
This algorithm seems kind of nice.  It always converges (because it
always moves downwards in the graph, never up, at least if you make
some mild assumptions), and you always end up with your candidate set
equal to the heads of the policy branch.

However, consider an antagonistic fork:
     A   <-- states that Alice and Mallory have commit access
     |
     B   <-- states that only Alice has commit access; signed by
     |       Alice
     C   <-- states that only Mallory has commit access; signed by
             Mallory
A is the trust seed, so we start out trusting both Alice and Mallory.
Under these rules, B and C are both members of the branch.  To find
the branch head, we run erase_ancestors({B, C}), and get C.  At this
point the algorithm has converged, so Mallory wins.  Obviously this is
not the desired result.  Not only does his name mean that he's
obviously evil, but even if (in a shocking double-cross) Alice turned
out to be the evil member of the scenario, it's clear that she could
simply commit another revision and trust would switch back to her.
The result is trust rules that oscillate without end, rather than
converging to contradiction.

So this isn't going to meet our original requirements.  But we can
also use it as an example to explore property (*).  The first
statement of (*) was: "Once a revision R enters the candidate set, it
cannot be pushed out except by the arrival of some revision that _R_
believes should replace it."  Here, R is B.  Before C arrived, B
entered the candidate set; B left the candidate set when C arrived.
However, B states that C should not be trusted; therefore, C's ability
to push B out of the candidate set is a violation of the property.

The other statement of (*) was: "resetting your trust seed to some
reachable descendent, should only ever remove revisions from the
candidate set, never add them".  In this case, if C were not here, B
would be in the candidate set, so B is reachable.  Now set our trust
seed to B.  In doing this, the candidate set changes from {C} to {B},
which is not a simple shrinking.

Hopefully it is clear at this point why the naive algorithm fails, and
how the failure is tied up in how it lacks (*).

Another try
-----------

Here's an algorithm that does work:

Call a revision R with cert C "trivially reachable" from some other
revision S, if R is a descendent of S, and C is a cert trusted by S
which claims that R is in the policy branch.  Trivial reachability
defines an overlay graph on the revision graph, where you draw an edge
between two nodes when one is trivially reachable from the other.
It's the heads in _this_ graph which form our candidate set, and full
reachability is simply equivalent to reachability in this graph.

Looking at our example again:
     A   <-- states that Alice and Mallory have commit access
     |
     B   <-- states that only Alice has commit access; signed by
     |       Alice
     C   <-- states that only Mallory has commit access; signed by
             Mallory
We have that B is trivially reachable from A, and C is trivially
reachable from A, but C is not trivially reachable from B.  Drawing
this as a picture we have:
     A
    / \
   B   C
Clearly, the candidate set is {B, C}.  Monotone can detect this and
throw up its hands.  Because of the second formulation of (*), though,
we are guaranteed that users can update their trust seed to point to B
or C, and from that point either Mallory or Alice is fully gone.

This all seems a bit complicated and confusing, but I don't have any
really better idea.  We need something like (*), and apparently that
inevitably means that there will be some disagreement between the
"candidate set" and "heads of policy branch"?

Multiple candidates
-------------------

In general, there is a question of how to handle the case when there
are multiple policy candidates.  The simplest thing to do is simply
throw up our hands and refuse to go on.  We have to do this in some
cases, like the one above, but it might turn into a problem in
practice -- e.g., if two admins both independently add a new user,
they will create accidental divergence until they merge.  Stopping
everyone from working until this merge happens might cause problems
(then again, this might be rare enough to not be an issue in
practice).

If this is an issue, one option would be to only fault when
_disagreement_ occurs.  I.e., whenever we need to know whether
something is permitted, we query every member of the candidate set,
and if they all agree, we go with that.  If they do disagree on
something, but it turns out that we don't need to know either way
about that thing in order to perform our current task, then no
problem.

Remaining problems
------------------

I think the algorithm that's given here could work sufficiently well
if we just want to solve the first-pass problem I posed in the
/previous email -- i.e., if we track branch-level trust _only_.  If
that's all we're doing, then the ability to commit to the policy
branch is equivalent to full project admin access, similar to "owning
the domain name" or "has root on the project server" in other
permissions systems.

That makes it okay that with the algorithm given here, anyone who
could ever commit to the policy branch forever retains the ability to
stall the project -- this is inevitable.  However, it's clear that we
will want to have _some_ kind of additional metadata, including
metadata that non-admins can modify.  (For example, if we move the
branch namespace into the policy branch, then you will need to make a
commit to the policy branch merely to start a new branch!)

So, consider a situation like:

     A   <-- Alice can change policy, Tom can create branches
    /|
   / B   <-- Alice says Tom's access is revoked
  /
 C       <-- Tom adds a branch

Somehow, it seems like B has to beat C here, because otherwise we can
never make anyone go away.  If that's true, then (*) is actually not
the right property, because C is certainly a candidate when B is not
present.

I don't know how to solve this; losing (*) makes me nervous.  One dumb
answer would be to split things into two branches -- a "trust branch"
and a "metadata branch", where we do special bootstrap stuff for the
first branch, which contains sensitive permission information, then
treat the second branch as an ordinary branch.  This would be a
nightmare to explain to users, though, and seems really inexpressive,
because there are only 3 levels of possible access we could ever grant
("god", "all metadata but that's it", "nobody").

If we do have to lose (*), then there is one ray of hope -- we can
distinguish the Alice/Tom situation from the Alice/Mallory situation,
because in the Alice/Tom situation, revision C says that the A->B
transition was good (because Tom doesn't have the permissions needed
to revoke Alice's access, we know that C still states that Alice has
admin access), while revision B says that the A->C transition was
not.  In the Alice/Mallory situation, the two heads each reject the
other.  Perhaps that will let us get somewhere.

-- Nathaniel

-- 
Damn the Solar System.  Bad light; planets too distant; pestered with
comets; feeble contrivance; could make a better one myself.
  -- Lord Jeffrey

[Prev in Thread]

Current Thread

[Next in Thread]

[Monotone-devel] [RFC] versioned policy -- bootstrapping, Nathaniel Smith <=
- [Monotone-devel] Re: [RFC] versioned policy -- bootstrapping, Wim Oudshoorn, 2006/09/10
  - Re: [Monotone-devel] Re: [RFC] versioned policy -- bootstrapping, Timothy Brownawell, 2006/09/10
    - [Monotone-devel] Re: [RFC] versioned policy -- bootstrapping, Wim Oudshoorn, 2006/09/10
  - Re: [Monotone-devel] Re: [RFC] versioned policy -- bootstrapping, Nathaniel Smith, 2006/09/10
    - [Monotone-devel] Re: [RFC] versioned policy -- bootstrapping, Bruce Stephens, 2006/09/10
    - [Monotone-devel] Re: [RFC] versioned policy -- bootstrapping, Wim Oudshoorn, 2006/09/10
- Re: [Monotone-devel] [RFC] versioned policy -- bootstrapping, Bruce Stephens, 2006/09/10

Prev by Date: Re: [Monotone-devel] Error handling via automate
Next by Date: Re: [Monotone-devel] Error handling via automate
Previous by thread: [Monotone-devel] [RFC] DAG-based revision refinement
Next by thread: [Monotone-devel] Re: [RFC] versioned policy -- bootstrapping
Index(es):
- Date
- Thread