Re: Revision control

From: olafBuddenhagen
Subject: Re: Revision control
Date: Sat, 28 Jun 2008 04:24:50 +0200
User-agent: Mutt/1.5.18 (2008-05-17)


On Thu, Jun 26, 2008 at 04:50:09PM +0200, Arne Babenhauserheide wrote:
> On Thursday, 26 June 2008 00:25:28, olafBuddenhagen@gmx.net wrote:

> > Mercurial's interface *might* be more intuitive than Git for CVS
> > users (I can't tell); but that doesn't necessarily mean that overall
> > usability is better.
> It doesn't necessarily mean that. But having worked with Mercurial and
> with Git, Mercurial feels a lot more usable to me. 

Well, as you seem to have used Mercurial much more and longer, that
isn't really surprising :-) If I learned Mercurial now, I'd most likely
hold exactly the opposite opinion... That doesn't really tell much.

> > I don't know how Mercurial stores branches; but I very much hope
> > that it doesn't need to read all the stuff from unrelated side
> > branches and/or ancient history when accessing a file, either...
> > Otherwise, its efficiency must be *much* worse than that of Git.
> It has index files which it reads and sees which parts of the
> datafiles it needs. Also the file-data has occasional snapshots of
> the whole file (as soon as the compressed diff-data exceeds the
> compressed file-size).
> So it just reads a very compact index file and after that only the
> parts of the files it needs. 

Which, I think, is pretty exactly the same as Git does... So we can
assume that as long as the number of touched files in Mercurial is about
the same as the number of touched packs in Git, performance will be

> > > For the Linux kernel, you need to "sanitize" yourself, but in
> > > smaller projects, the individual history might be very interesting
> > > to other developers.
> >
> > Sorry, this is bullshit. There is absolutely no reason why any
> > project -- small or large -- should have the history littered with
> > individual developer's meandering. It has no value whatsoever, and
> > only makes it harder to understand.
> One example: - Tracking when a bug was introduced, and why. Also after
> the developer left. 

Actually, this is much easier with a sanitized history. It's always
clear when and why a change was introduced -- while otherwise, with
omission and later amends, mistakes and later reverts, prototypes and
later cleanups, experiments and later switching approaches, it is much
harder to make sense of it. It's just useless data getting in the way.

Bisecting is much easier with a clean history, too.
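
A minimal sketch of what bisecting looks like on a clean, linear
history, using a current Git release. The throwaway repository, the file
name "file.txt", the commit messages and the marker word "BROKEN" are
all made up for illustration; the only point is that each commit is one
self-contained change, so the search lands on a single meaningful commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

# Five small commits; the (made-up) bug arrives with commit 4.
for i in 1 2 3 4 5; do
    echo "change $i" >> file.txt
    if [ "$i" -eq 4 ]; then echo "BROKEN" >> file.txt; fi
    git add file.txt
    git commit -q -m "commit $i"
done

# HEAD is known bad, the first commit (HEAD~4) is known good.
git bisect start HEAD HEAD~4 >/dev/null

# The test command exits 0 for a good revision, non-zero for a bad one.
git bisect run sh -c '! grep -q BROKEN file.txt' >/dev/null

# refs/bisect/bad now points at the first bad commit.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

With a history full of half-finished intermediate states, the test
command would have to cope with revisions that are broken for unrelated
reasons, which is where the clean history pays off.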

Anyways, Git doesn't force you to change history. It just makes it
really easy if desired, thus encouraging finding new workflows that
would be impossible or much more awkward with other systems. Git doesn't
focus on particular workflows; rather, it makes it easy to do all kinds
of things, and leaves it to the users to come up with the optimal

I don't think that when Linus designed Git, he was thinking "it must be
easy to change history". Rather, I suspect that the flexible design of
Git just made changing history easy as a byproduct, along with many
other things considered uncommon up till then. It encouraged
experimentation -- and people soon discovered that contrary to the
prevailing dogma about revision control, changing history can actually be
a very useful thing.

> > Seriously, garbage collection in Git is hardly a burden worth
> > mentioning -- it isn't needed nearly as often as to make it into
> > one.
> We feel different here, but I assume that's just different experiences
> and ways of working on things. 

Honestly, how often did you actually need to GC Git repositories so
far?... :-)

Admittedly, I haven't used Git terribly much either yet. So I guess we
are none of us qualified to judge how much it really matters in

> > So the argument goes like: Garbage collection would pose serious
> > problems in some hypothetic use case -> garbage collection is bad ->
> > Git is bad -> we shouldn't use Git for the Hurd repository, even if
> > it has no relation whatsoever with the problematic use case?
> > 
> > Don't you think this is a bit silly?...
> No. This was only about garbage collection, for me, since that was
> where the argument got into "generally good idea vs. generally bad
> idea". 

Well, such a discussion will always tend to touch more general questions.
However, when I was pointing out that it doesn't matter for the Hurd
repository, I was obviously not trying to make a statement about GC in
general, but to come back to the actual topic at hand :-)

> Git has the stronger emphasis on "here are the commands, play with
> them", while Mercurial has the stronger emphasis on "just use it and
> check the more complex things when you need them". 

Actually, the more we are discussing this, the more I'm beginning to
think that Git's steeper learning curve is not only a worthwhile price
to pay, but actually a *good* thing in itself...

With a flat learning curve, people can start using a tool with only
little knowledge; but whenever they need something new, they have to
learn again -- it's always an uphill battle... And there is little
motivation to learn, as they don't even see the benefits it would give
them. So people rather tend to struggle on and on with their limited

By requiring users to learn more from the beginning, Git on the other
hand makes every user into an expert -- and thus indirectly helps them
use the tool really efficiently...

> (besides: Mercurial has the record extension for doing that fine
> grained: Accept or reject every single change in your working copy.
> Just say $ hg record and it asks you about each file if you want to
> include it, and then about each change in that file. I assume Git can
> do that, too.)

Of course. (Using "git-add -i", or "git-gui".)

> I use partial commits, when working on texts to group only related
> changes together with one commit message, so I know their merits. 
> And this is a reply to 
> > I fail to see anything specialized in Git's interface, for big
> > projects or otherwise.
> To show, that it is optimized for some workflows, just like every
> other efficient system. 

I don't think that the behaviour of "git-commit" vs. "git-commit -a" has
anything to do with optimization for any particular workflow.

Perhaps it's simply historical -- maybe "-a" was introduced later than
the other variant. Or perhaps it is considered less annoying as a
default action. (Committing too much by accident if "-a" was default
would be worse than the command simply failing if forgetting the "-a" as
it is now. It's actually something that bothered me with CVS quite

But most likely, it's simply because committing the index is the basic
action, while automatically finding the files to commit is an additional
feature; and "-a" is easy enough that the Git developers didn't see a
need to abstract this fact.

I am sure that with most *any* kind of workflow, "-a" is more often used
than not. So it is *not* optimizing for a particular workflow. Rather,
it's simply a manifestation of Git's interface being very direct;
avoiding abstraction where it's not important to have it. It's a
manifestation, in fact, of *not* optimizing for specific workflows.
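
To make the difference concrete, here is a small sketch, assuming a
current Git release (where the commands are spelled "git commit" rather
than "git-commit"). The temporary repository and the file names "a.txt"
and "b.txt" are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo "first version" > a.txt
echo "first version" > b.txt
git add a.txt b.txt
git commit -q -m "initial"

# Change both files, but put only a.txt into the index.
echo "second version, longer" > a.txt
echo "second version, longer" > b.txt
git add a.txt

# Plain commit: records exactly what is in the index, i.e. a.txt only.
git commit -q -m "change a"
after_plain=$(git status --porcelain)

# With -a: all modified tracked files are staged first, then committed.
git commit -q -a -m "change b"
after_all=$(git status --porcelain)

echo "after plain commit: [$after_plain]"
echo "after commit -a:    [$after_all]"
```

After the plain commit, b.txt is still listed as modified; after the
"-a" commit, nothing is left over.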

> > I on my part seldom type long commands by hand -- usually I get them
> > from shell history. For things I do really often, I can always
> > create a shell alias. You could alias "gca" to "git-commit -a" and
> > "gu" to "git-checkout -m" for example... That beats even the hg
> > variants ;-)
> And is possible to do with hg just the same, but in Git you have to do
> it yourself instead of having the comfortable commands supplied by
> default. 

You are missing the point. If you really want it comfortable, you need
to create your own aliases anyways, so the default shortcuts don't
really help much.

One might argue that they actually do harm, by reducing the motivation
to create individual aliases... ;-)
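
For illustration, the two aliases suggested in the quoted text could be
defined like this in a shell startup file. The names "gca" and "gu" are
just the ones from that suggestion, and the commands use the "git
commit" spelling of current Git releases instead of "git-commit".

```shell
# Short names for two frequently used commands.
alias gca='git commit -a'     # commit all modified tracked files
alias gu='git checkout -m'    # check out, carrying local changes along

# Print the definitions the shell has recorded.
alias gca
alias gu
```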

> But checkout doesn't work the same in subversion, which is what I was
> used to before switching to Mercurial. 

I really don't care about Subversion. I can see some little merit in
trying to be similar to CVS, because that's the least common
denominator, what has been used for ages, what almost everybody knows.
Subversion on the other hand is nothing else but just the least useful
one among the newer systems.

But anyways, this is not really the point :-) The point is that
"git-checkout" is different from both "cvs update" and "svn update", so
it would be only confusing to call it the same. (While *making* it the
same would be less consistent and logical.)

> "svn checkout" gets a repository onto your disk (like "git clone"/"hg
> clone") and "svn update" updates the data (like "git checkout"/"hg
> update"). 
> But "svn update" and "cvs update" both update the working directory. 

And what about switching branches? In CVS, this is usually done with
"cvs update" as well, although the action is actually more similar to an
initial checkout, and calling it "update" doesn't make any sense. (In
fact, the initial checkout is just a special case, switching from the
NULL branch to a real one...)

"git-checkout -m" can also be used to switch branches (while carrying
along local changes), so even my previous suggestion of aliasing
"git-update" to "git-checkout -m" wouldn't make sense.

Essentially it's all the same action (syncing from repository to working
tree), even if it can be used for different purposes in different
contexts; and using the same command makes perfect sense. It allows the
user to see it as it really is.

In CVS, merging branches and reverting is also done with "update".
Should that be imitated as well, for convenience of former CVS users?...

"git-checkout" requires getting used to, but it's a fact that it is more
consistent and logical, and helps in the long run. Which IMHO is true
for many things in the Git UI.

> > But that would only bloat the command set; and worse, it would hide
> > the fact that the commands are essentially the same, thus preventing
> > the user from gaining a true understanding.
> I can't go with this one. 
> How is $ git checkout
> the same as $ git checkout .
> The former says what I changed, the latter gives me changes I pulled
> beforehand. 

Actually, they do *almost* the same: Both check out files from the
repository to the working copy, but abort and warn if there are local
changes. The only difference is handling of files missing in the working
copy: The first treats them as changes and fails, while the second just
checks them out.

It happens that in your specific situation, because of this slight
difference, the one command failed with a warning, while the other did
work. That doesn't make them fundamentally different actions.

("git-checkout" without parameters on a checked out working tree will
actually always either fail or do nothing... It just doesn't make sense
in this context.)
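
The missing-file case can be tried out with a sketch like the following.
It assumes a current Git release; messages and exit status of Git as it
was in 2008 may well have differed. The repository and the file name
"file.txt" are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo "content" > file.txt
git add file.txt
git commit -q -m "initial"

# Make the file go missing from the working tree.
rm file.txt

# Without a path: the missing file is reported as a local change
# and is not brought back.
report=$(git checkout 2>&1) || true
if [ -e file.txt ]; then still_missing=no; else still_missing=yes; fi

# With a path: the file is checked out again from the index.
git checkout . >/dev/null 2>&1

echo "report from plain checkout: $report"
echo "still missing after plain checkout: $still_missing"
echo "restored content: $(cat file.txt)"
```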

> If a tool doesn't feel familiar after 15 min (or rather after some
> hours), then I will have to wrap myself around the tool, and it is
> likely to be inefficient in the long run, even though I might not even
> notice it anymore, because I got used to it. 

This is a baseless claim. Something that is harder to learn, might
indeed turn out problematic in the long run. It can just as well turn
out perfectly efficient and reasonable, just different. "Different" is
not always bad. "Different" can actually be a good thing.

"I do not know whether it will get better if it becomes different; I
only know that it must become different if it is to get better." --
Georg Christoph Lichtenberg

> But at least I think we managed to carry this discussion through, even
> though it seems we'll have to agree to disagree. 
> And we took a bit longer than the target of "arguing for one week" :) 
> I hope the others in this list didn't mind the long posts. 

I doubt the others on this list are still reading this thread :-)

> To finish it, I created a small side by side comparison of Git and
> Mercurial which I hope I managed to keep neutral. 

>   Hg  | vs            | Git
>   +   | documentation |

I don't agree here. IMHO the Git documentation is perfectly good.

(The rest of this part seems fine.)

>                 = Usability =
>   Hg  | vs                                              | Git
>   +   | short commands and basic operations very easy   |
>   +   | commands similar to SVN where possible          |
>   +   | less error prone -> don't have to understand    |
>       | every part to be able to use it                 |
>       | easy rebasing                                   |  +
>   +   | just works                                      |

I don't fully agree with your conclusions, but more importantly, I don't
agree at all with what constitutes "usability" in your list :-)

And what is "just works" supposed to mean, anyways?...

> Both are very powerful, but Git focusses more on users who want to
> learn it in depth, while Mercurial focusses more on users who want
> something that just works and has very few pitfalls, so they can read
> up on advanced features later on. 

I think I can agree with this conclusion :-)

> What's left is finding out, which one will be best for the Hurd, and I
> think neither Olaf nor me are qualified to really "vote" on that (at
> least I am not, because I mostly work on the Hurd wiki, not on its
> code). 
> So I pass that question to the main Hurd contributors, now. 

The main committers over the past few years have been Thomas Schwinge
and Samuel Thibault.

Thomas set up the Git-based wiki, so I guess his preference is clear :-)

I asked Samuel now, and he said he doesn't care, though on further
questioning he admitted a preference for Mercurial.

Neal Walfield has been committing a lot recently in the hurd-l4 module.
As this is totally separate though, there is no reason why it must use
the same VCS... (I have no idea about his preference.)

We also have the GSoC students presently, who are divided as well.

So this leaves us still in the same place, I fear...

