Re: Revision control

bug-hurd

From:	olafBuddenhagen
Subject:	Re: Revision control
Date:	Sun, 29 Jun 2008 06:21:16 +0200
User-agent:	Mutt/1.5.18 (2008-05-17)
Hi,

On Sat, Jun 28, 2008 at 12:25:36PM +0200, Arne Babenhauserheide wrote:
> On Saturday 28 June 2008 04:24:50, olafBuddenhagen@gmx.net wrote:

> > Well, as you seem to have used Mercurial much more and longer, that
> > isn't really surprising :-) If I learned Mercurial now, I'd most
> > likely hold exactly the opposite opinion... That doesn't really tell
> > much.
> 
> Please tell me how it works for you, once you have to use it. 

I'll try to keep it in mind... But no promises :-)

> > Actually, this is much easier with a sanitized history. It's always
> > clear when and why a change was introduced -- 
> 
> It is only clear that it happened during "implementing feature x", but
> not that it happened during "hacking that damn network table - quick
> fix to get it working". 

Sanitizing history is about purging stuff that doesn't matter for the
end result; not about dropping important information. If people don't
understand the importance of proper commit messages, you are screwed, no
matter what workflow.

> > while otherwise, with omission and later amends, mistakes and later
> > reverts, prototypes and later cleanups, experiments and later
> > switching approaches, it is much harder to make sense of it. It's
> > just useless data getting in the way.
> 
> You just let your system search through a batch of changesets at once,
> then it looks the same for the reviewers as modified history,

?

> but it offers more information if you need it. 

You never need meaningless information :-)

> I came up with the group extension: 
[...]
> - Enable me to put changesets into a group.
> - Hide all grouped changesets from the log and show their groups
>   instead.
> - Only look at groups.
> - Also group groups.

This is an interesting feature :-)

But it doesn't alleviate the fact that garbage in the history is a bad
thing, period.

> The changes are what you get as result. The storage is just the
> technical solution and only matters, where it affects (means: limits
> or forces to modify) interactions with the changes. 
[...]
> Naturally, it does necessarily affect them at every stage, but that
> should be limited to speed of actions (and that also as little as
> possible), and it should not force me to do a certain action

Well, there are two aspects to the Git storage: How the objects are
stored on disk (files and packs); and how the history is represented by
objects.

The first aspect affects the user very little, except by the fact that
repacking needs to be triggered manually.

The second aspect is infinitely more interesting: The fact that the
object structure is extremely simple, and represents the history very
directly; that it is absolutely flexible in how it can be used, all the
specifics of revision control being merely conventions. The fact that
the user works on it very directly, with hardly any abstraction; can do
anything he likes with it; can easily gain a deep understanding of what
is going on, what effect his actions have exactly -- not at some
abstract level, but on the actual object structure; always knows what is
possible, how things fit together, how to reach a desired result; always
has full confidence, knowing nothing unexpected can happen.

In short: Everything that makes Git so wonderful.
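
To make the object structure more tangible, here is a minimal sketch in
Python. It is a toy model, not Git's actual code: object ids are
computed the way Git computes them (SHA-1 over a "<type> <size>" header,
a NUL byte, and the payload), so the blob ids match real Git, but the
tree and commit payloads are simplified stand-ins. All names (store,
put, blob, tree, commit, history) are made up for illustration.

```python
import hashlib

store = {}  # object id -> (type, payload); stands in for the object database

def put(kind, payload):
    """Store an object under the SHA-1 of its header plus payload."""
    data = kind.encode() + b" " + str(len(payload)).encode() + b"\0" + payload
    oid = hashlib.sha1(data).hexdigest()
    store[oid] = (kind, payload)
    return oid

def blob(content):
    return put("blob", content)

def tree(entries):
    # Simplified: real Git trees use a binary format that includes file modes.
    lines = [f"{name} {oid}" for name, oid in sorted(entries.items())]
    return put("tree", "\n".join(lines).encode())

def commit(tree_id, parents, message):
    # Simplified: real commits also record author, committer and dates.
    lines = [f"tree {tree_id}"] + [f"parent {p}" for p in parents]
    return put("commit", ("\n".join(lines) + "\n\n" + message).encode())

def history(oid):
    """Walk first-parent links from a commit back to the root commit."""
    while oid:
        yield oid
        _, payload = store[oid]
        header = payload.decode().split("\n\n", 1)[0].splitlines()
        parents = [l.split()[1] for l in header if l.startswith("parent ")]
        oid = parents[0] if parents else None

c1 = commit(tree({"README": blob(b"hello\n")}), [], "initial")
c2 = commit(tree({"README": blob(b"hello, Hurd\n")}), [c1], "greet the Hurd")
print(list(history(c2)) == [c2, c1])
```

In this model a commit is nothing more than a small object naming a
tree and its parents, and the history is just the chain of such
objects. Everything else is convention layered on top.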

Note that "git-gc" not only repacks, but actually does much more: Most
notably, pruning unreferenced objects. This is a very fundamental
property: Git never ever deletes something that once made it into the
repository or the index, unless explicitly asked by the user to prune
(directly or through "git-gc"). All normal actions the user performs
only create, update, or drop references -- which can easily be
recovered. (Manually or through the reflogs.)

This is so fundamental, because it means that no matter how much you
screw up your repository, you can always fix it; you never really lose
anything. Once users understand that, they never need to be afraid of
doing something wrong.

This is also part of what makes Git so lovely. (Once you understand how
it works...)
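
The principle can be sketched in a few lines of Python. This is a
deliberately crude model under stated assumptions, not how Git is
implemented: commits are reduced to an id with a single parent, and the
names (objects, refs, reflog, update_ref, prune) are hypothetical. One
simplification to keep in mind: real Git also keeps objects that are
still referenced from reflog entries until those entries expire, which
the toy prune() below ignores.

```python
objects = {}   # commit id -> parent id (None for the root commit)
refs = {}      # branch name -> commit id
reflog = []    # (branch, old id, new id), appended on every ref update

def update_ref(branch, new):
    """Point a branch at a commit, recording the old value in the reflog."""
    reflog.append((branch, refs.get(branch), new))
    refs[branch] = new

def commit(branch, cid):
    """Create commit `cid` on top of the branch tip and advance the ref."""
    objects[cid] = refs.get(branch)
    update_ref(branch, cid)

def reachable():
    """All commits reachable from any ref."""
    seen = set()
    for cid in refs.values():
        while cid is not None and cid not in seen:
            seen.add(cid)
            cid = objects[cid]
    return seen

def prune():
    """The only operation in this model that ever deletes objects."""
    keep = reachable()
    for cid in list(objects):
        if cid not in keep:
            del objects[cid]

commit("master", "A")
commit("master", "B")
commit("master", "C")

update_ref("master", "A")     # the "screw-up": B and C are thrown away
assert "C" in objects         # ...but only a reference moved

_, old, _ = reflog[-1]        # the reflog still knows the previous tip
update_ref("master", old)     # recover it
assert refs["master"] == "C"
```

Only if prune() were called while "master" still pointed at A would B
and C actually be gone.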

> You can't know every nuance from the beginning without losing a vast
> amount of time. 

You don't need to know every nuance. You only need to understand how
things fit together. Then you will know where to look whenever you need
something specific.

> I learned about that on a much lower scale from my wife who works in a
> bank. Some of her colleagues are simply computer DAUs (dumbest
> thinkable user - dümmster anzunehmender User), who only do exactly
> what they were told to do, because they fear that they could break
> anything if they do something wrong. 
> 
> I know this sounds like encouraging to "force every user to learn", but
> rather the opposite is true: As soon as the basic usage is (or just
> feels) so complicated that a user gives up on understanding it, trying
> to "force people to become experts" makes them DAUs instead, who don't
> feel comfortable in their own environment. 

This doesn't really have anything to do with basic usage being
complicated or not. What creates this extreme anxiety is a lack of
understanding of how things work, which makes it impossible to predict the
effect of any unknown action, any new situation. This is very tough for
people who otherwise have very little to do with computers -- it's just
an enormous amount of things they would need to learn about computers,
about software, about operating systems etc.

But luckily that's not the target audience for Git. Git users are
programmers -- people who already know how computers work, how software
works, how operating systems work, how data structures work, how UNIX
works etc. Learning the fundamentals of Git, gaining a deep
understanding of how it works, shouldn't take more than a couple of hours.
(If you start reading at the right place at least...)

> > Or perhaps it is considered less annoying as a default action.
> > (Committing too much by accident if "-a" was default would be worse
> > than the command simply failing if forgetting the "-a" as it is now.
> > It's actually something that bothered me with CVS quite often.)
> 
> At least for me, having to add the -a wouldn't help me remember.
>
> After having added it about 20 times, my hands would do it
> automatically without me even noticing it. 

Well, if you use a workflow where you *always* do -a, it doesn't matter
anyways. If on the other hand you do partial commits now and then, i.e.
use both variants, there should be no problem.
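
For those who haven't used both variants, here is a rough Python sketch
of the difference. It is a toy model resting on assumptions: the Repo
class and its fields are invented for illustration, file deletions are
ignored, and the error text is the toy's own. What it does reflect is
that a plain commit only records what was explicitly staged, while "-a"
first stages every modified file that is already tracked (and nothing
that is untracked).

```python
class Repo:
    def __init__(self):
        self.head = {}      # path -> content in the last commit (tracked)
        self.index = {}     # path -> content staged for the next commit
        self.worktree = {}  # path -> content on disk

    def add(self, path):
        """Stage the current on-disk content of one file."""
        self.index[path] = self.worktree[path]

    def commit(self, all_tracked=False):
        if all_tracked:
            # Like "-a": stage every modified file that is already tracked.
            # (Deleted files are ignored in this toy model.)
            for path, old in self.head.items():
                if path in self.worktree and self.worktree[path] != old:
                    self.index[path] = self.worktree[path]
        if not self.index:
            raise RuntimeError("no changes added to commit")
        self.head.update(self.index)
        self.index = {}

repo = Repo()
repo.worktree = {"a.c": "v1", "b.c": "v1"}
repo.add("a.c")
repo.add("b.c")
repo.commit()                      # initial commit of both files

repo.worktree["a.c"] = "v2"
repo.worktree["b.c"] = "v2"
repo.add("a.c")
repo.commit()                      # partial commit: only a.c is recorded
print(repo.head)

try:
    repo.commit()                  # forgot "-a": nothing staged, so it fails
except RuntimeError as err:
    print("error:", err)

repo.commit(all_tracked=True)      # like "commit -a": picks up b.c as well
print(repo.head)
```

Forgetting "-a" costs one failed command; an implicit "-a" could
silently commit b.c along with a.c.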

> Git on the other hand also has these easy commands, but they don't
> look consciously added to me, but historically grown. 

Indeed, it is clear that some things have historically grown. Yet it is
remarkably logical and consistent, once you take the trouble to
understand it...

> hg update <branch>
> 
> the same as 
> 
> hg update <tag>
> 
> and 
> 
> hg update <revision number, hex or short>
> 
> They update the working repository to some specific state (and in the
> case of a branch, that state doesn't just differ linearly but goes
> "sideways", too).

So the same nonsense as in CVS.

(Don't tell me "update" makes more sense than "checkout" when going back
in history or switching branches... It just doesn't.)

> > "git-checkout" requires getting used to, but it's a fact that it is
> > more consistent and logical, and helps in the long run. Which IMHO
> > is true for many things in the Git UI.
> 
> In that aspect it seems similar to "hg update", but "git checkout" has
> some nasty pitfalls. One of them already ate several hours of my time
> - I already talked about that one in here.

The pitfalls only exist if you are coming with wrong assumptions based
on other systems. There are no pitfalls if you actually understand what
"checkout" does in Git -- it's different than other systems, but
perfectly logical and consistent in itself.

> What happened was: "git checkout" worked quite well. It didn't warn or
> anything. It just said "there are missing files, sucker." 

Wrong. It told you that it can't check out, because some files are
locally modified (by deletion).

> "git checkout ." on the other hand gave me back the files. 
> 
> Technically these two might be quite similar, but from the user
> interaction standpoint, they are vastly different. 

The result of an error condition is always vastly different from
successful operation...

> > "I do not know whether it will get better if it changes; I only
> > know that it must change if it is to get better." --
> > Georg Christoph Lichtenberg
> 
> That's what I added with the second part: If it proves to be bad in
> the long term, it isn't good either. But a tool should manage both,
> else it's very likely that it skews my perception.
> 
> And not everything different is better, though it's almost always useful to
> try different things to find out, where to change next. 

The point is that without change, things can never get better.

Git does change some things in the UI, forsaking CVS compatibility, so
it can make them better. "git-checkout" is a prime example of that.

> > I don't fully agree on your conclusions, but more importantly, I
> > don't agree at all to what constitutes "usability" in your list :-)
> 
> Could you create a different list, so we can merge (or get two lists
> for different sets or priorities)? 

Not without some serious consideration... Perhaps I will think about it
some time over the next few days, or perhaps I won't. No promises.

> > And what is "just works" supposed to mean, anyways?...
> 
> That means: it didn't yet surprise me with unexpected behaviour, which
> is something I also heard from many other Mercurial users.

Well, Git didn't surprise me either so far... but I'm aware that this
won't be true for people who don't take the trouble to understand it
first :-)

> Or, maybe this one fits: When free tools which programmers use get
> spread enough (and there isn't something in place to stop people from
> contributing), it will almost always come down to philosophy and basic
> concepts in the end, because everything else can be fixed by throwing
> person-hours at it :) 

That's not entirely true. Certain design choices make some things easier
and some things harder. If something is too hard, it likely won't be
implemented, or very late, or only half-heartedly...

There is for example nothing that can be done on the Hurd, which
couldn't also be done on Linux *somehow*. But some things are just much,
much simpler to do on the Hurd...

> Maybe next someone will invent incremental garbage collection for git
> (can be done at each pull),

It seems that some commands actually do automatic repacking when
needed... But I don't know which.

("pull" doesn't really seem like a good candidate -- unless using a dumb
transport, it always fetches a single pack...)

> just like Mercurial gets simpler rebasing at the moment, and step by
> step the projects will push each other onward. I hope they will exist
> side by side for a long time, so they can surpass any proprietary VCS
> in every aspect. 

Well, there is a feature in Perforce -- I sadly don't remember the name
-- which Git most likely will never implement: The ability to completely
erase the source of certain revisions of a file from the repository.

Aside from such controversial features, I'd be much surprised to learn
that Git did *not* already surpass every proprietary VCS in every
aspect a long time ago...

> Having friendly competition can be a huge gain for a project (if it's
> friendly), since ideas in each project also bear fruits in the other
> one. :) 

Indeed :-)

-antrik-