Re: Revision control

bug-hurd

From:	Arne Babenhauserheide
Subject:	Re: Revision control
Date:	Thu, 26 Jun 2008 16:50:09 +0200
User-agent:	KMail/1.9.9
On Thursday, 26 June 2008 00:25:28, olafBuddenhagen@gmx.net wrote:
> You are confusing things. Usability is *not* the same as intuitiveness.
> Intuitiveness is only *one* element of usability. (And IMHO one that is
> overrated by most usability people...)

[...]

> Mercurial's interface *might* be more intuitive than Git for CVS users
> (I can't tell); but that doesn't necessarily mean that overall usability
> is better.

It doesn't necessarily mean that. But having worked with Mercurial and with 
Git, Mercurial feels a lot more usable to me. 

> > "hg pull" gets the changes from somewhere else.
> >
> > "hg up" updates the files you see.
>
> I see.
>
> > Uh, what about Cogito?
> 
> From what I gathered, the standard interface of Git itself didn't
> initially offer a complete set of comfortable high-level commands. Now
> it does, and Cogito became obsolete.
>
> Or maybe it just proves my point that a stronger abstraction than the
> one offered by the standard interface is not really what people want in
> the long run... :-)

The git interface got a lot better, but it also changed (from what I read), so 
that cogito wouldn't be completely compatible anymore. 

So I think, these two are the main reasons why cogito isn't developed anymore. 

> open() is mostly a no-op, except for some bookkeeping; the file size
> doesn't matter at all. If by "open" you actually mean reading the whole
> contents of the pack: I'm pretty certain Git doesn't do that. Would be
> rather stupid.

Each object in the pack is compressed individually, so Git can read only the 
objects it needs. I don't know how they are ordered, though, so I can't say 
whether the new objects can be read with sequential disk access (in which case 
most of the needed objects would get pulled into the disk cache when the first 
one is accessed). 
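
As a rough illustration of why per-object compression allows reading only the 
needed objects, here is a toy sketch in Python. It is not Git's actual pack 
format: the names build_pack and read_object, the in-memory io.BytesIO "file" 
and the plain SHA-1-of-content ids are assumptions made for this example. 

```python
import hashlib
import io
import zlib


def build_pack(objects):
    """Write each object zlib-compressed on its own; remember where it landed."""
    pack = io.BytesIO()          # stands in for the pack file on disk
    index = {}                   # object id -> (offset, compressed length)
    for content in objects:
        oid = hashlib.sha1(content).hexdigest()   # toy id, not Git's real object id
        blob = zlib.compress(content)
        index[oid] = (pack.tell(), len(blob))
        pack.write(blob)
    return pack, index


def read_object(pack, index, oid):
    """Seek to one object and decompress only that object."""
    offset, length = index[oid]
    pack.seek(offset)
    return zlib.decompress(pack.read(length))


if __name__ == "__main__":
    pack, index = build_pack([b"first file\n", b"second file\n"])
    wanted = hashlib.sha1(b"second file\n").hexdigest()
    print(read_object(pack, index, wanted))
```

In this sketch, objects written one after another also sit next to each other 
in the file, so reading them in sequence needs no extra seeks. Whether a real 
pack orders objects that way is exactly the open question above. 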

> I don't know how Mercurial stores branches; but I very much hope that it
> doesn't need to read all the stuff from unrelated side branches and/or
> ancient history when accessing a file, either... Otherwise, its
> efficiency must be *much* worse than that of Git.

It has index files, which it reads to see which parts of the data files it 
needs. The file data also contains occasional snapshots of the whole file (as 
soon as the compressed diff data exceeds the compressed file size). 

So it just reads a very compact index file and after that only the parts of 
the files it needs. 

And due to that it minimizes disk-seeks (that was one of the design principles 
of the Mercurial datastore). 

-> http://www.selenic.com/mercurial/wiki/index.cgi/Revlog
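
To make the index-plus-snapshots idea concrete, here is a minimal sketch in 
Python. It is a toy under stated assumptions, not Mercurial's implementation: 
the class ToyRevlog, the helpers make_delta and apply_delta, and the simple 
prefix/suffix delta encoding are all made up for this example. Only the 
snapshot rule (store a full copy once the compressed deltas outgrow the 
compressed file) follows the description above. 

```python
import zlib


def make_delta(old, new):
    """Encode `new` against `old`: shared prefix length, shared suffix length, middle."""
    limit = min(len(old), len(new))
    p = 0
    while p < limit and old[p] == new[p]:
        p += 1
    s = 0
    while s < limit - p and old[len(old) - 1 - s] == new[len(new) - 1 - s]:
        s += 1
    return p.to_bytes(4, "big") + s.to_bytes(4, "big") + new[p:len(new) - s]


def apply_delta(old, delta):
    """Rebuild the new text from the old text and a delta made by make_delta."""
    p = int.from_bytes(delta[:4], "big")
    s = int.from_bytes(delta[4:8], "big")
    return old[:p] + delta[8:] + old[len(old) - s:]


class ToyRevlog:
    def __init__(self):
        self.index = []            # per revision: (offset, length, base revision)
        self.data = bytearray()    # compressed snapshots and deltas, appended in order
        self._last_text = b""      # text of the newest revision, kept for diffing

    def add(self, text):
        rev = len(self.index)
        full = zlib.compress(text)
        chunk, base = full, rev    # default: store a full snapshot
        if rev > 0:
            prev_base = self.index[rev - 1][2]
            delta = zlib.compress(make_delta(self._last_text, text))
            chain = sum(length for _, length, _ in self.index[prev_base + 1:rev])
            # Keep using deltas only while the whole delta chain stays smaller
            # than a compressed copy of the file itself.
            if chain + len(delta) <= len(full):
                chunk, base = delta, prev_base
        self.index.append((len(self.data), len(chunk), base))
        self.data += chunk
        self._last_text = text
        return rev

    def read(self, rev):
        base = self.index[rev][2]
        start = self.index[base][0]
        end = self.index[rev][0] + self.index[rev][1]
        span = bytes(self.data[start:end])    # one contiguous read, no extra seeks
        text, pos = b"", 0
        for r in range(base, rev + 1):
            length = self.index[r][1]
            chunk = zlib.decompress(span[pos:pos + length])
            text = chunk if r == base else apply_delta(text, chunk)
            pos += length
        return text


if __name__ == "__main__":
    log = ToyRevlog()
    for text in (b"one\n", b"one\ntwo\n", b"one\ntwo\nthree\n"):
        log.add(text)
    print(log.read(1))
```

The point of the layout is that read() looks at the small index first and then 
fetches a single contiguous span from the last snapshot up to the wanted 
revision, which is how I understand the "minimize disk seeks" goal. 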

> > How often?
>
> All the time...

That's quite different from the workflow I use, then. 

> > For the Linux kernel, you need to "sanitize" yourself, but in smaller
> > projects, the individual history might be very interesting to other
> > developers.
>
> Sorry, this is bullshit. There is absolutely no reason why any project
> -- small or large -- should have the history littered with individual
> developer's meandering. It has no value whatsoever, and only makes it
> harder to understand.

One example: 
- Tracking when a bug was introduced, and why. Also after the developer left. 

> Making it impossible (or harder) to clean up the history, only
> discourages using version control aggressively. In fact, if you think
> all detours should be visible in the history, you can just as well stick
> with a centralized system and upload every change immediately...

First: Mercurial doesn't actively "make it harder to clean up", it just 
doesn't yet provide the simple scripts for rewriting history. 

And "rewriting history" is definitely not the reason why I switched to a 
decentralized system. 

> > In Git I have to care for the store from time to time, to avoid it
> > getting inefficient. so it gets in my way.
> >
> > I don't want to have to care for my tool.
> >
> > It's not my child. It's a tool.
> >
> > I want to care for the programs I write with it, instead.
>
> Sorry, my bad: I was thinking we were discussing practical merits, not
> philosophy :-P

Not having to care for my tool is a practical merit to me.  :) 

> Seriously, garbage collection in Git is hardly a burden worth mentioning
> -- it isn't needed nearly as often as to make it into one.

We feel differently here, but I assume that's just down to different 
experiences and ways of working on things. 

For you it is no big deal, and that's certainly true for many people. 

For me it is a big deal, and that's certainly true for many other people. 

Otherwise this discussion would be one of a kind, and there would be a blog 
post around with the definitive answer that fits all. But it certainly isn't: 
many people have had discussions like this one before. 
They reached different conclusions, depending on the needs and experiences of 
the people involved, and that's OK, too. 

Now we just need to find out which solution is the best for the Hurd. 

> The *only* situation where too big packs are a problem is when someone
> is pulling stuff through a dumb transport.
>
> And anyways, there is an option to limit the size of the generated
> packs, if you care.

That's good to know. 

> So the argument goes like: Garbage collection would pose serious
> problems in some hypothetic use case -> garbage collection is bad -> Git
> is bad -> we shouldn't use Git for the Hurd repository, even if it has
> no relation whatsoever with the problematic use case?
> 
> Don't you think this is a bit silly?...

No. For me this was only about garbage collection, since that was where the 
argument turned into "generally good idea vs. generally bad idea". 

Whether that then applies to the Hurd is a separate question. 

> Usability for serious programmers, who work with the version control
> every day; have their individual workflows, and need to do untypical
> things sometimes.

"... and want things to just work" makes this fit for Mercurial, too. 

Git has the stronger emphasis on "here are the commands, play with them", 
while Mercurial has the stronger emphasis on "just use it and check the more 
complex things when you need them". 

> We will have to agree to disagree.

I think you're right on this, and I don't mind. 

We have different priorities, so it's just natural that we get to different 
conclusions. 

> > In a large project where you only maintain a small part of the whole
> > tree, you're likely to want to commit only the parts
>
> By that definition, Hurd must be a large project indeed... ;-)

Sorry, I trailed off at the end of the sentence... this should have read: 

... only the parts you directly specify instead of all the parts you worked 
on. 
In Git that is easier, because you say "git add file" to make Git record a 
current snapshot of the data in that file, while updating and committing 
everything takes a bit more typing than in Mercurial. That's why I say Git is 
optimized for that. 

(Besides: Mercurial has the record extension for doing that in a fine-grained 
way: accept or reject every single change in your working copy. Just say
$ hg record
and it asks you for each file whether you want to include it, and then about 
each change in that file. 
I assume Git can do that, too.)

> Partial commits are useful if you have several individual changes in

Sorry for breaking off the sentence too early (I sent it just before I had to 
leave for university). 

When working on texts, I use partial commits to group only related changes 
together under one commit message, so I know their merits. 

And this is a reply to 

> I fail to see anything specialized in Git's interface, for big projects or 
> otherwise.

It is meant to show that Git is optimized for some workflows, just like every 
other efficient system. 

When you use git, you'll very likely have people rewriting their history to 
make their commits look prettier, while with Mercurial you'll very likely 
have people pulling and reviewing changes from each other before pushing them 
into the shared repository. 

> Shorthand aliases are certainly *not* easier to learn.

$ svn up 
directly translates to
$ hg up

> I on my part seldom type long commands by hand -- usually I get them
> from shell history. For things I do really often, I can always create a
> shell alias. You could alias "gca" to "git-commit -a" and "gu" to
> "git-checkout -m" for example... That beats even the hg variants ;-)

And that is possible with hg just the same, but in Git you have to do it 
yourself instead of having the comfortable commands supplied by default. 

> workflow; it's simply impossible to map the commands from one to the
> other. The "hg up" example demonstrates this quite clearly.

Yet I can explain in about three sentences how to work with Mercurial when 
you already know SVN. 

> (Well, perhaps it was rather stupid of me, and I could have guessed what
> "hg up" does if I thought about it more... But at least it shows that
> it's not really intuitive.)

That's why I said "people can use it, when ... pull ... push ... 
recommendations (at the bottom of the output of some commands)". 

> So, "git-checkout" is actually very similar to "cvs checkout" in that it
> can do both. (Except that "cvs checkout" automatically merges, while
> "git-checkout" requires giving "-m" explicitly.) This really makes
> sense: Both actions check out stuff from the repository to the working
> copy, only that in the second case local changes are merged.

Which is the same difference as for "svn up" and "hg up". 

But checkout doesn't work the same in subversion, which is what I was used to 
before switching to Mercurial. 

"svn checkout" gets a repository onto your disk (like "git clone"/"hg clone") 
and "svn update" updates the data (like "git checkout"/"hg update"). 

But "svn update" and "cvs update" both update the working directory. 

> Of course, it would be possible to introduce "git-update" as an alias
> for "git-checkout -m". But that would only bloat the command set; and
> worse, it would hide the fact that the commands are essentially the
> same, thus preventing the user from gaining a true understanding.

I can't go with this one. 

How is 
$ git checkout

the same as 
$ git checkout .

The former says what I changed, the latter gives me changes I pulled 
beforehand. 

> My whole point was that I do *not* judge by how familiar a tool seems
> after 15 minutes, but how powerful it turns out in the long run.

I do both. 

If a tool doesn't feel familiar after 15 min (or rather after some hours), 
then I will have to wrap myself around the tool, and it is likely to be 
inefficient in the long run, even though I might not even notice it anymore, 
because I got used to it. 

If a tool feels familiar at once, but hinders me in the long run, I'll check 
out other tools. 

A good tool needs to fulfill both goals for me, and Mercurial does that. 

> There are different use cases of course. If I wanted to introduce
> version control for a designer team or for executives or so, I'd
> probably go for Mercurial. But here we are talking about serious
> programmers -- people who are able to understand and appreciate a less
> abstract interface.

Understand: Definitely. 
Appreciate: That depends on the alternatives and the priorities. 

But at least I think we managed to carry this discussion through, even though 
it seems we'll have to agree to disagree. 

And we took a bit longer than the target of "arguing for one week" :) 

I hope the others in this list didn't mind the long posts. 

I learned very much about revision control during the last few weeks (by 
thinking deeper about the different systems and reading up on the net to 
check if what I think is right before posting, as well as directly from your 
posts - especially things I didn't find about Git otherwise), and so I'd like 
to say: thank you for the good discussion, Olaf. 

And also thanks to Ivan, Andrei, Anatoly and Michael - and Thomas who (for me) 
initiated this "week" of arguing. :)

To finish it, I created a small side-by-side comparison of Git and Mercurial 
which I hope I managed to keep neutral. 

I got the arguments by re-reading all of our posts. 

= Git and Hg comparison =

Unresolved issues: 

** garbage collecting in Git **
- Olaf: It's good, because you can do it, when you want it, and it is very 
efficient and the design gives you more power. 
- Arne: It isn't good to have to do it at all, because I don't want to have to 
care for my tool when there are good alternatives. 

** virtual revision numbers in Mercurial **
- Olaf: You have to look up the revision anyway, so they don't gain much. 
- Arne: They make working far more convenient for me. 


                = Comparison =

        Hg      | vs              | Git
        +       | easy to learn   |
                | change history  |  +
        +       | portable        |
        +       | documentation   |
                | speed           |  +
                | unix principles |  +
    see below   | usability       | see below


                = Usability =

        Hg      | vs                        | Git

        +       | short commands and basic  |
                | operations very easy      |

        +       | commands similar to SVN   |
                | where possible            |

        +       | less error prone -> don't |
                | have to understand every  |
                | part to be able to use it |

                | easy rebasing             |  +

        +       | just works                |

We didn't agree on how usability holds up in the long run, 
since we only had my experience of the difference between the two: 
Olaf hasn't tried Mercurial yet. 

== Features ==

Git allows for easier rebasing (changing history: git rebase), 
while Mercurial allows for easily sharing a personal repository 
with a lightweight integrated http server (hg serve). 

Arne: Git sadly doesn't install on my Hurd in qemu, but I'm sure that can be 
resolved somehow. 

Both are very powerful, but Git focusses more on users who want to learn it in 
depth, while Mercurial focusses more on users who want something that just 
works and has very few pitfalls, so they can read up on advanced features 
later on. 

What's left is finding out which one will be best for the Hurd, and I think 
neither Olaf nor I am qualified to really "vote" on that (at least I am 
not, because I mostly work on the Hurd wiki, not on its code). 


So I pass that question to the main Hurd contributors, now. 


Best wishes, 
Arne
-- 
Being unpolitical
means being political
without noticing it. 
- Arne Babenhauserheide ( http://draketo.de )

-- Weblog: http://blog.draketo.de
-- Infinite Hands: http://infinite-hands.draketo.de - singing a part of the 
history of free software. 
-- Ein Würfel System: http://1w6.org - simply clean (role-playing) rules

-- My public key (PGP/GnuPG): 
http://draketo.de/inhalt/ich/pubkey.txt