[Gnu-arch-users] [Fwd: summer of code for the GNU project]


From: Thomas Lord
Subject: [Gnu-arch-users] [Fwd: summer of code for the GNU project]
Date: Tue, 25 Apr 2006 13:55:51 -0700
User-agent: Thunderbird 1.5 (X11/20060313)

I changed my mind.

-t

--- Begin Message ---
Subject: summer of code for the GNU project
Date: Tue, 25 Apr 2006 13:04:42 -0700
User-agent: Thunderbird 1.5 (X11/20060313)

I volunteer to mentor on behalf of the FSF.

The project will be to help implement Arch 2.0.

I suggest Andy Tai as co-mentor and back-up mentor.  I've
separately asked him if he'll agree to this.

While students may propose any Arch-related project, here is
my idea:


In Arch 2.0 I would like to factor Arch into several separate
programs, each of which does one thing well.  I would like to
get closer to the "software tools" strategy -- farther away from
the "big ball of mud".

I think that a student could, in the time allotted, make some
tools that are just a tiny subset of revision control and that
will be useful whether or not Arch 2.0 is ever finished.

I propose a project to implement tools for "tree inventory"
and "whole tree diff and patch".

Tree inventory tool:  a tree inventory examines the
contents of a tree and distinguishes "important" files
from "discardable" files.   For example, if the tree is a
C program, the ".c" and ".h" files are important but the
".o" files and Emacs back-up files ("*~") are "discardable".
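
To make the classification step concrete, here is a minimal sketch
in Python. It is an illustration only, not Arch's implementation:
the pattern list and the names DISCARDABLE_PATTERNS, is_discardable,
and inventory are hypothetical, and the sketch looks only at the
files that os.walk reports, not at directories themselves.

```python
import fnmatch
import os

# Hypothetical patterns for illustration; a real tool would let the
# user configure what counts as "discardable".
DISCARDABLE_PATTERNS = ["*.o", "*~"]


def is_discardable(name):
    """Return True if a file name matches any discardable pattern."""
    return any(fnmatch.fnmatchcase(name, p) for p in DISCARDABLE_PATTERNS)


def inventory(root):
    """Walk `root` and split its files into (important, discardable).

    Paths are returned relative to `root`, each list sorted.
    """
    important, discardable = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if is_discardable(name):
                discardable.append(rel)
            else:
                important.append(rel)
    return sorted(important), sorted(discardable)
```

Calling inventory(".") on a C source tree would, under these assumed
patterns, put the ".c" and ".h" files in the first list and the ".o"
and "*~" files in the second.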

The inventory tool also assigns a logical ID to files such
that that ID is independent of the file name.  If you
rename "foo.c" to be "bar.c", the inventory tool should
say beforehand that "foo.c" has logical identity X and,
after, that "bar.c" now has logical identity X.

There should be flexible ways for a user to assign
logical identities to files.

Directories, symbolic links, and special files should
be able to have logical identities.
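
One way to picture logical IDs is as a table kept separately from
the file names. The sketch below is only that, a picture: the
IdTable class and its methods are hypothetical, the table lives in
memory rather than in any control file, and generating IDs with
uuid4 is just one of the "flexible ways" a design might offer.

```python
import uuid


class IdTable:
    """Hypothetical in-memory map from tree-relative path to logical ID."""

    def __init__(self):
        self._id_by_path = {}

    def assign(self, path, logical_id=None):
        """Give `path` a logical ID; generate one if none is supplied."""
        if path in self._id_by_path:
            raise ValueError("path already has an ID: %s" % path)
        if logical_id is None:
            logical_id = uuid.uuid4().hex
        if logical_id in self._id_by_path.values():
            raise ValueError("ID already in use: %s" % logical_id)
        self._id_by_path[path] = logical_id
        return logical_id

    def rename(self, old_path, new_path):
        """Record a rename: the ID follows the file to its new name."""
        if new_path in self._id_by_path:
            raise ValueError("path already has an ID: %s" % new_path)
        self._id_by_path[new_path] = self._id_by_path.pop(old_path)

    def id_of(self, path):
        """Return the logical ID currently attached to `path`."""
        return self._id_by_path[path]
```

With this table, assigning X to "foo.c" and then recording a rename
to "bar.c" leaves id_of("bar.c") equal to X. Nothing in the sketch
cares whether the path names a file, a directory, or a symbolic link.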

The whole tree diff and patch:  A traditional recursive
diff compares files that have the same name in both
trees and doesn't compare directories at all.   An arch
recursive diff and patch should be based on the
logical IDs of the inventory tool.  If, in tree A, the
file with ID X is called "foo.c" and in tree B the file
with ID X is called "bar.c", Arch's whole tree diff
should know to compare "foo.c" to "bar.c".   If we
apply the resulting patch to a third tree, in which the
file with ID X is called "baz.c", Arch's whole tree patch
should know to apply the differences to "baz.c".
If comparison of two trees reveals that "foo.c" has
been renamed to "bar.c", then applying that patch to a
tree that still has "foo.c" should cause the file to be
renamed "bar.c".
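
The ID-based matching described above can be sketched as follows.
This is a hedged illustration, not Arch's algorithm: each tree is
represented as a plain dict from logical ID to path, the function
names tree_diff and plan_patch are hypothetical, and the content
comparison itself is left to an ordinary file diff. The sketch also
assumes that when the target tree uses a third name for the file,
the name is left alone; the text above does not settle that case.

```python
def tree_diff(tree_a, tree_b):
    """Pair up entries of two trees by logical ID.

    Each tree is a dict mapping logical ID -> path. Returns a list of
    (logical_id, path_in_a, path_in_b) for IDs present in both trees.
    """
    common = sorted(tree_a.keys() & tree_b.keys())
    return [(i, tree_a[i], tree_b[i]) for i in common]


def plan_patch(patch, target_tree):
    """Decide what to do in `target_tree` for each patch entry.

    Returns a list of (logical_id, path_to_patch, rename_to), where
    rename_to is None unless the target still uses the old name.
    """
    plan = []
    for logical_id, old_path, new_path in patch:
        if logical_id not in target_tree:
            continue  # the target has no file with this identity
        current = target_tree[logical_id]
        rename_to = None
        if current == old_path and old_path != new_path:
            rename_to = new_path
        plan.append((logical_id, current, rename_to))
    return plan
```

For example, with tree A as {"X": "foo.c"} and tree B as
{"X": "bar.c"}, tree_diff pairs "foo.c" with "bar.c". Planning that
patch against {"X": "baz.c"} targets "baz.c" for the content changes,
and planning it against {"X": "foo.c"} also asks for a rename to
"bar.c".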

Now, Arch 1.x already had these features but there are
problems with their implementation.   In Arch 1.x,
these features aren't available as separate tools -- you
would have a tricky time mixing them cleanly with
`git', for example.   And in Arch 1.x, people don't
much like the syntax and semantics of the various
control files that are used.  And in Arch 1.x, the
implementation does not have the greatest performance.
The problems in 1.x are hard to fix incrementally
because of a need for backward compatibility.

There is an opportunity to implement these features
cleanly -- from scratch.   To not worry *too* much about
backwards compatibility.   Just to take the good ideas and
implement them in a solid form.   With a little guidance,
a talented student could do this in a couple of months.
The result will be useful for Arch 2.0 but should also
be useful to users of `git', `Subversion', and other systems.

For a student, this is a good chance to get exposure to
fundamentals of coding to POSIX standards -- writing
nicely portable code.   It is a good chance to practice
using basic system calls in a context that requires
understanding them in depth -- like "fielding grounders"
practice in baseball.   This is also a great chance for a
student to have a hand in making the good ideas in
Arch more widely adopted.

As mentors, I think we should focus on fundamentals.
We'll make sure that code is formatted nicely.  We'll
make sure that code doesn't have "beginner's bugs".
We'll make sure that the detailed feature design -- which
will be largely up to the student -- is informed by the
wisdom of experience.   In short, we'll do our best to help
the student write some really great shell tools in a classic
style.

I think we should also insist on documentation.  It will be more
important that the student document features implemented than
add new features.

A nice property of this project is that there are multiple
levels at which victory can be declared.   If a student
does a great job with just "inventory" but fails to complete
whole tree patching?   That's quite alright -- a useful tool
has still been produced.   On the other hand, if a student
just sails through these tasks and we need more for the
student to do?  No sweat: we also need a tool for computing
signable tree fingerprints and much more -- there are dozens
of follow-up possibilities.

I can commit to 4-8 hours/week for this with a caveat.
The caveat is that my condition of poverty means that
we cannot ignore the possibility I will have to flake out
at some point -- a co-mentor and back up is important.

What do you think?


-t




--- End Message ---
