[Monotone-devel] GCC and Monotone

From:	Tom Tromey
Subject:	[Monotone-devel] GCC and Monotone
Date:	18 Nov 2003 17:41:04 -0700
User-agent:	Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50
I thought I'd send out a rough draft of my "How to use Monotone for
GCC" white-paper.  It should be pretty straightforward for most
Monotone folks.

Please comment and critique.  Eventually I'd like to get this in good
enough shape that it could be used to make the case for switching gcc
to use monotone.

Tom


This is a proposal for using monotone to handle gcc version control.
The reader is assumed to be familiar with gcc development practices in
general.

- link to monotone

Monotone is an interesting new version control system.
http://www.venge.net/monotone/


- gcc practices

[ Not too sure what I intended to put here.
  Suffice to say, CVS is adequate for gcc, but we could eliminate
  existing problems and reduce costs by switching. ]


- Zack once wrote a message describing his desires for a future
version control system for use with gcc.  You can see his list here:

    http://gcc.gnu.org/ml/gcc/2002-12/msg00444.html

Monotone satisfies all these points (and more).  Some of the
requirements are met in a somewhat unusual way, for those used to cvs
development.  For instance, for point 0c, there is no "remote write
operation" (but in monotone all operations have strong cryptographic
integrity).


- other stuff

  - merge costs

There are a couple of common kinds of merge that add expense to the gcc
development process.

The first one, merging between branches, is covered above in response
to Zack's requirements.

The second type is merging across institutional boundaries.  Various
entities maintain their own forks of gcc -- most Linux vendors, Apple,
BSD projects, etc.  Monotone will facilitate merges across these
boundaries.

At first it might seem like this is against the interests of the gcc
community, which has always tried to have a single release.  However,
these forks already exist, and lowering the cost of merging across
these boundaries should reduce the divergence here.  Also, lowering
merging costs will free up some small amount of time for gcc hacking
(developers who do the merge will have more time to hack, since merges
will be much simpler).


  - patch review costs

I find patch review and approval to be a pain.  We've given many
people write-after-approval access to gcc simply to reduce the burden
of checking in patches.  In those cases where users don't have write
access, there is usually some work involved in applying the patch (and
dealing with patch formatting issues and the like), and occasionally a
to-and-fro as the original submitter asks for the patch to be applied.

With monotone this burden is substantially reduced.  Frequent
contributors can simply post changes in packet form, which can easily
be viewed (with a GUI or with a special mode in your mailer).

When using monotone, a patch approval message is identical to a
commit.  There is no separate step, no fiddling with patch bits
(except for non-monotone-using users -- analogous to those who don't
use cvs today).  A patch can be signed by any number of people without
harm.  Patches can also be rejected using this same mechanism.

A patch can be applied, at any time, to the revision on which it is
based.  This causes no problems for monotone; it simply adds a new
merge obligation.  Instead of rewriting the patch by hand, or dealing
with the .rej files, you can use 3-way merging to bring it up to date.

So, for instance, if someone new sends a patch (in non-monotone
format) against a pristine copy of gcc 3.5, you can check out 3.5
as-is, apply the patch, commit, and then merge (or not) the resulting
new head.


  - testing costs

Most people have had the problem of doing a "cvs update" in the
morning, only to find that gcc no longer builds.  There have been
various attempts at fixing this, but there are still the occasional
reversion countdowns and disagreements.  For instance, here's a
recent one:

    http://gcc.gnu.org/ml/gcc/2003-11/msg00955.html

With monotone, we could have the various auto-testers sign revisions.
Then users could use monotone's update feature to select known-working
versions.  We would simply need to define the appropriate set of tags.
I picture tags in varying levels of strictness:

* builds on <triplet>
* no regressions against baseline <x> for target <triplet>

(specifying <x> here is an open problem for the time being)

Once these are in place, it is easy to update to whatever you're
interested in.  For instance, if you are mostly working on a target
library, you could always ensure that you would get a compiler that
can actually bootstrap.
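
To make the selection idea concrete, here is a minimal sketch in Python.
It is not monotone's actual update code or data model.  The Revision
class, the "order" field (a stand-in for "newer than"), and the
(signer, tag) cert pairs are all assumptions made up for illustration.
It picks the newest revision that carries every required tag, signed by
a tester you trust.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    rev_id: str
    order: int  # hypothetical: a larger number means a newer revision
    certs: set = field(default_factory=set)  # (signer, tag) pairs


def newest_acceptable(revisions, trusted_signers, required_tags):
    """Return the newest revision carrying every required tag, each
    signed by at least one trusted signer, or None if there is none."""
    def acceptable(rev):
        return all(
            any((signer, tag) in rev.certs for signer in trusted_signers)
            for tag in required_tags
        )

    candidates = [rev for rev in revisions if acceptable(rev)]
    return max(candidates, key=lambda rev: rev.order, default=None)
```

A library developer would pass a tag such as "builds on <triplet>" and
the set of auto-testers he trusts; revisions tagged only by unknown
signers are ignored.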


  - network costs

Hosting cvs puts a network strain on gcc.gnu.org.  We've had to move
it to a co-lo, we ask people to use anonymous mirrors, etc.  With
monotone this can be greatly alleviated.  Institutions with a lot of
internal use can set up nntp peers and keep a lot of traffic on their
internal networks.

For people or institutions fetching the gcc repository for the first
time, we can take several approaches.  First, they could download from
a friendly nntp server.  Second, they could simply copy an existing
database from a friend.  And, third, we could offer copies of the gcc
database (perhaps with different amounts of history, from "trunk only,
recent past" up to "everything") for download via BitTorrent.



- potential problems

  - who does the merge?

Monotone doesn't require a commit to be merged into the head
revision -- in fact, there is no single head revision.  So one
question is, who does the merge?

In many cases, merges can be done automatically.  That's because most
commits don't overlap.  However, gcc is a large project with many
developers.  Sometimes changes do overlap.

I recommend that, in most cases, the person committing then fetch new
changes and perform a merge of his new commit with other existing
heads.
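
As an illustration of what "existing heads" means, here is a small
Python sketch.  It assumes the revision graph is given as a plain
dictionary from revision id to parent ids; that representation is
invented for this example and is not how monotone stores anything.  A
head is simply a revision that no other revision names as a parent.

```python
def heads(parents):
    """Return the sorted ids of revisions that have no children.

    `parents` maps each revision id to a list of its parent ids
    (a hypothetical, simplified view of the revision graph).
    """
    all_revisions = set(parents)
    has_child = {p for plist in parents.values() for p in plist}
    return sorted(all_revisions - has_child)
```

Two independent commits on the same base yield two heads; once the
committer records a merge revision with both as parents, a single head
remains.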


  - uberbaum

Monotone will have the "restrictions" feature, which will act
something like cvs modules and let people check out a subset of the
tree.  In our view, when the switch to monotone happens, we should
also erase the artificial separation between the gcc and src trees,
and merge them for real.


  - if I accept your change, I accept your tree

... have to expand on this


  - disk space

Monotone will require that more disk space be available to each
individual contributor, because you'll need both a working copy (as
with cvs) and a database.  In most cases, however, you won't need all
the previous history.  I recommend we distribute pre-built smaller
databases containing the most useful subsets of history; developers
can always fetch more history as needed.

Motivated large institutions could consider rewriting the storage
layer to share a database among many people.  This would have no
implications for anyone else, and is not intrinsically difficult.

  - how to rewrite cvsweb urls?

Currently there isn't a web client for monotone.  That's a barrier to
adoption, since the GCC project would want to continue to make the
source tree browseable online.

Converting cvsweb URLs could be done by modifying the monotone
cvs_import script to generate another database that could be used to
rewrite the URLs.  Converting these is important since then we'll be
able to keep the information in bugzilla correct.
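
A rough Python sketch of the lookup side of this follows.  Everything
specific in it is an assumption: the cvsweb path prefix, the "rev"
query parameter, and the table keyed by (file path, CVS revision) that
a modified import step would have to produce.  Since there is no
monotone web client yet, the function returns only the monotone id and
leaves the target URL undefined.

```python
from urllib.parse import urlparse, parse_qs

# Assumed layout of old cvsweb URLs; adjust to the real installation.
CVSWEB_PREFIX = "/cgi-bin/cvsweb.cgi/"


def lookup_revision(url, table):
    """Map an old cvsweb URL to a monotone id, or None if unknown.

    `table` is a hypothetical dict from (file path, CVS revision)
    to monotone id, generated at import time.
    """
    parsed = urlparse(url)
    if not parsed.path.startswith(CVSWEB_PREFIX):
        return None
    path = parsed.path[len(CVSWEB_PREFIX):]
    revisions = parse_qs(parsed.query).get("rev")
    if not revisions:
        return None
    return table.get((path, revisions[0]))
```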

  - renaming branches

Monotone encourages the use of hierarchical, DNS-like branch names.
Existing gcc branches will have to be renamed.  With monotone it is
also conventional to use a DNS-like name for a project's trunk.

I recommend we use "org.gnu.gcc" as the name of the trunk.  Other
existing branches will be automatically renamed based on that, e.g.,
"org.gnu.gcc.gcc-3_3-branch".

Future branches will have more natural names, e.g., "org.gnu.gcc.3_5".

Furthermore I recommend we adopt a convention that any gcc developer
can create any number of branches based on his user name, e.g.,
"org.gnu.gcc.tromey.gcj-rewrite".  Some developers will want to use
this mechanism to keep track of unfinished patches of various kinds;
with this approach, it is easy to share an unfinished patch with other
developers.  When a patch is completed, it can easily be merged
elsewhere with "monotone propagate".
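
The naming convention above is mechanical enough to sketch in a few
lines of Python.  The function names are hypothetical and this is not
part of any import tool; it only shows how the names proposed here
would be derived.

```python
TRUNK = "org.gnu.gcc"  # proposed name of the trunk


def imported_branch_name(cvs_branch=None):
    """Name for an existing CVS branch; None means the trunk."""
    if cvs_branch is None:
        return TRUNK
    return f"{TRUNK}.{cvs_branch}"


def personal_branch_name(user, topic):
    """Name for a developer's own branch under his user name."""
    return f"{TRUNK}.{user}.{topic}"
```

For example, imported_branch_name("gcc-3_3-branch") gives
"org.gnu.gcc.gcc-3_3-branch", and personal_branch_name("tromey",
"gcj-rewrite") gives "org.gnu.gcc.tromey.gcj-rewrite".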


  - notification of changes

I recommend that GCC use mailing lists for transport, with a gateway
to nntp.  Then there is no special need for notification, a la cvs.
You simply set up to accept packets from the mailing list.