Re: [Arx-users] Repo format take II

From: Kevin Smith
Subject: Re: [Arx-users] Repo format take II
Date: Tue, 20 Dec 2005 09:54:23 -0500
User-agent: Mozilla Thunderbird 1.0.7 (X11/20051011)

Walter Landry wrote:
Kevin Smith <address@hidden> wrote:

Walter Landry wrote:

At a glance, the formats of those directory names all look quite similar. Would there be value in naming them such that you could tell just by the name what kind of directory this is? Perhaps a leading 'C' or 'S'?

There is no room in the name of the skip delta for another character.
The two hashes take up 30 characters, and the skip level makes it 31.

For the other types, I don't think it will help that much to precede
them with a special character.  The file names are a jumble of hex
characters already.

Ok. So it's ok to have two completely different "types" of directories using an identical naming scheme. Either you never care which "type" a directory is, or it's ok to read the contents of a directory to determine its "type". I was just thinking that there might be times (especially over slow remote links) where it would be useful to know that you don't have to read any contents of a directory because it's not an interesting type. I didn't have any specific cases in mind, but I know in my own work it is usually helpful to have naming strategies that allow type detection without extra reading.

Just curious why the T is at the end rather than the beginning. I can imagine some possible reasons, but would like to know the real one(s).

No particular reason.  It satisfies my internal sense of aesthetics,
but I don't feel strongly.  The hashes are fixed length, while the
following string's length can be anything zero and up.


I'm not familiar with the term "terminal revision". Could something be both a patch revision and a terminal revision at the same time?

Terminal revisions tell ArX that that revision should not be
considered when looking for HEAD.

> Actually, a "terminal revision" makes a terminated microbranch.

I still don't like "terminal revision", since it sounds too much like a leaf revision, which would in fact be a HEAD. How about "terminating revision"? Or "dead revision" or "killed revision", or something else that reflects that this case only happens when the user kills it. Along those lines, a name ending with X would make sense, although T is ok too, if it's not needed for Tag.

The first part is the directory where the repo is, the second is the
project name, which is currently implemented as a directory
"project.d".  But that last part is opaque to the user.

Ok. They still seem like distinct entities, but it seems fine to start this way, at least.

Also, I would hope that the UI would allow a default repo so the user would only have to specify the project name.

That is one of the things that I was hoping to get away from.  I heard
too many complaints about default repos, and it always kind of grated
on me.  I am not vetoing the idea, but I would like to see how things work
without them.  Monotone manages to get by without them.


He creates more revisions.  When he commits revision 32, that also
creates a skip-delta back to the first revision.

I thought skip-deltas typically relied on the random creation of links.

I do not know what you mean here.

Doh! I was thinking of skip-lists, which are very similar, but which are typically generated by inserting elements in a random sequence. Never mind.

Would this actually be hard-coded at 32?

That is my current thought.  That makes the total required space
N(1+log(N)), which for 65000 revisions is about a factor of 4 bigger
than N.  That is a worst case scenario, and I don't think that
Subversion has seen such big factors in practice.  I will certainly
run tests on the gcc repo before setting anything in stone.

Also, with 32 that gives you 31+31+31+2=95 patches that you need for
revision 65535, and I wanted to keep that number under 100 (why 100?
No particular reason).
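
The arithmetic above can be checked with a short sketch. The space factor follows the N(1+log(N)) estimate, taking the logarithm in base 32. The patch count uses an assumed model that is not spelled out in the thread: the sum of the base-32 digits of the revision number. Under that assumption revision 65535 needs 94 patches, one fewer than the 95 quoted, so the original count evidently treats the top level slightly differently. The function names here are illustrative only.

```python
import math

SKIP_BASE = 32  # skip interval proposed in the thread


def space_factor(n_revisions, base=SKIP_BASE):
    """Worst-case storage relative to N plain patches: 1 + log_base(N)."""
    return 1 + math.log(n_revisions, base)


def digit_sum_patches(revision, base=SKIP_BASE):
    """Assumed model: one patch per unit of each base-`base` digit."""
    total = 0
    while revision > 0:
        revision, digit = divmod(revision, base)
        total += digit
    return total


print(round(space_factor(65000), 2))   # roughly a factor of 4
print(digit_sum_patches(65535))        # 1 + 31 + 31 + 31 under this model
```
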

> 256*256=65536, and 60000 revisions is a design goal.  So that would
> make two levels of 256 directories each, and should not degrade too
> much.

I know it really doesn't matter, but using numbers like 256 and 65000 adds to the nerdiness of the project, which already has a nerds-only inside joke as a name (ArX). Unless there are compelling technical reasons for using these binary numbers (which will at least be partly exposed via documentation), I would prefer using numbers that are easier for non-nerds to deal with, like 100 and 100000.

Random thought: Why not name the top-level directories 1, 2, 3 instead of 0, 256, 512? There could be a value written in a repo config file that indicates how many revisions go in each directory, so it's not locked in stone for every repo on every underlying filing system for all time. Or maybe 0000, 0001, ... so they are all the same length, and are sortable.
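
To make the two naming ideas concrete, here is a minimal sketch. It assumes revisions are numbered from 0 and that the number of revisions per directory is a per-repo setting; `per_dir`, `width`, and both function names are hypothetical and not part of ArX.

```python
def start_numbered_dir(revision, per_dir=256):
    """Name the directory after the first revision it holds: 0, 256, 512, ..."""
    return str((revision // per_dir) * per_dir)


def padded_index_dir(revision, per_dir=256, width=4):
    """Name the directory by its zero-padded index: 0000, 0001, 0002, ..."""
    return str(revision // per_dir).zfill(width)


# Revision 600 falls in the third directory under either scheme.
print(start_numbered_dir(600))             # 512
print(padded_index_dir(600))               # 0002
# A repo configured with 100 revisions per directory.
print(padded_index_dir(12345, per_dir=100))  # 0123
```

With fixed-width names, a plain lexicographic sort of the directory listing matches revision order, which is not the case for 0, 256, 512, 1024.
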

which just copies everything over.  He periodically resyncs, and the
dirhash files mean that he only has to list directories that have
changed.  He hacks by getting revisions out of his own repo,
committing, and merging.

 arx get panza_repo,project project_tree
 cd project_tree
 arx commit -m "cheaper, faster, better"
 arx propagate /home/quixote/repo ../panza_repo
 arx merge

So "propagate" would merge at a repo level? It's late and I'm tired so I'll assume I'm missing something.

I am not sure what you mean by "merge".  It is not doing what monotone
does during "merge".  Propagate just copies revisions around.

Ah, the monotone model. So propagate would pull other folks' branches into my repo, but wouldn't affect my own branches. What if I have two repos and have worked on the "same branch" in both of them? I can see that it could pull in my "other" revisions without conflict because they are named by their hash. But wouldn't the sequence numbers collide? Last time I looked at monotone, it didn't have sequence numbers.

Yep.  Though I think relocate and propagate are two commands that
people are going to need to use.

Yes, but only after they are familiar with the basics. If I understand relocate (and I don't think I do), it would only be used rarely.

