monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Re: current multiple heads (was Re: write access to


From: Asger Ottar Alstrup
Subject: Re: [Monotone-devel] Re: current multiple heads (was Re: write access to my public server)
Date: Fri, 30 Apr 2004 09:16:02 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113

graydon hoare wrote:

 - totally overhaul monotone's internal storage structure to represent
   versions primarily as sets of changes (a la darcs and arch). this is
   the most expensive solution, but I've done expensive things here in
   the past and I'm not completely adverse to the concept. my major
   concern with this is that set-like representations grow forever,
   which penalizes users for having successful, large-scale and
   long-lived projects. nonetheless, it is an option; arch has its
   "cached versions" and darcs has its "checkpoint patches". maybe that
   makes more sense.

I think a representation like this is better, since it seems so much simpler. A repository is a set of changes, organised in a DAG. To help solve the problem of space, distinguish between the relations between changes and the contents of the changes. So each edge in the DAG points to a patch, but the patch itself can be missing, because it has been compressed away.

In order to work on a repository with others in a distributed manner, it should be sufficient to know all the relations between changes, but not the exact contents of those changes. In other words, all users should have the full DAG, but can omit many of the patches themselves.

So, yes, the growth of a repository is linear in the number of changes, but the recording of each change is fixed in size. Of course, the space requirements for representing all history is the sum of the size of each change.

In order to control which patches to drop, each user can define his own policy. In order words, the user can define a compression policy.

A compression policy will combine patches to reduce the size. So, if we have a DAG with a long path like A -> B -> ... -> Z, instead of keeping each individual patch from A->B, and then from B->C and so on, the system can compress the history to a single patch A -> Z. This can then be inserted into the DAG as a special "compression" edge. This operation will not remove the edges in the DAG from A -> B, B->C and so on, but it will delete the patches linked to those edges, because it is replaced by a new link from A -> Z with a link to the combining patch.

Such a compression polich can be specified to work on a number of different criteria:

- Prune content based on time - let the granularity of changes increase back into the history - we can only restore versions every X days.

- Prune content based on space - reserve X MB for patches, or X %

- Prune content based on replication elsewhere - if host X has the history, I do not need it. A variant is "compress aggressively when we know all other hosts I work has my changes".

- Mark nodes in the DAG as checkpoints, which can not be compressed away.

- Keep only these lines of developments (i.e. ignore this and that branch)

- Keep only the history of text files, not binary files

If a client explicitly tries to restore a node in the DAG which has been compressed away, the client could try to contact other known servers to try and find that version.

Such a system would be flexible enough to support a wide range of use cases: In one end, we have the working mode where all can afford to keep the entire history for fast access, in the middle we have a policy where one or more central servers keeps all changes, while each client only keeps a subset of the history according to whatever polich, and at the extreme end, we have the situation where the system is only used to safely distribute changes and define current versions, but the history itself is not relevant.

Best regards,
Asger Ottar Alstrup





reply via email to

[Prev in Thread] Current Thread [Next in Thread]