[Monotone-devel] cvs_import rewrite


From: Markus Schiltknecht
Subject: [Monotone-devel] cvs_import rewrite
Date: Wed, 14 Dec 2005 12:06:37 +0100

Hello monotone hackers,

what's the status of the cvs_import rewrite branches? Is anybody still
working on them?

I have been looking at cvsps, which can extract patchsets from cvs
repositories. Unfortunately, it fails on inconsistent CVS repositories
(e.g. the PostgreSQL cvs repo).

Then I gave cvs2svn a try, which complained about a symbol occurring
twice in some files, but that was easy to fix. Then it converted the
whole inconsistent CVS repository into a subversion one. AFAICT it gets
most inconsistencies right and handles branches correctly.

cvs2svn processes the cvs repository in multiple passes. Some passes
store their results in database files (BerkeleyDB, I think it was). The
only drawback I saw was its inability to do subsequent (incremental)
conversions. So you have to convert once, then you must use subversion.

From the output and a quick glance at the source (in python) it does
something like this:
 * parse the cvsroot and its rcs files
 * check consistency of files
 * make up revisions from parsed rcs files
 * check consistency of revisions
 * sort cvs revisions
 * create a cvs revisions <-> svn revisions mapping
 * import the single revisions one after another into a subversion
repository.
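
To illustrate the "make up revisions from parsed rcs files" step, here is
a rough sketch of one common heuristic: group per-file revisions that
share author and log message and lie close together in time. I have not
checked that cvs2svn does exactly this. The FileRev record, the
group_changesets function and the 300 second window are all assumptions
made up for this example.

```python
from collections import namedtuple

# Hypothetical record for one per-file CVS revision (illustrative names).
FileRev = namedtuple("FileRev", "path rev author log timestamp")

WINDOW = 300  # assumed maximum gap in seconds inside one changeset


def group_changesets(file_revs, window=WINDOW):
    """Group per-file revisions into changesets by author, log and time."""
    changesets = []
    open_sets = {}  # (author, log) -> changeset currently being filled
    for fr in sorted(file_revs, key=lambda r: r.timestamp):
        key = (fr.author, fr.log)
        current = open_sets.get(key)
        # Start a new changeset if there is none for this key, if the gap
        # to the previous revision is too large, or if the file already
        # appears in the current changeset.
        if (current is None
                or fr.timestamp - current[-1].timestamp > window
                or any(c.path == fr.path for c in current)):
            current = []
            changesets.append(current)
            open_sets[key] = current
        current.append(fr)
    return changesets
```

A real importer would also have to look at branches and tags, which this
sketch ignores completely.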

Anyway, I thought about reimplementing cvs_import. I would use the
algorithms of cvs2svn to get a more or less consistent view of the cvs
repository. Due to the nature of monotone, it should be easy to improve
the algorithm to be able to handle subsequent imports.
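
A minimal sketch of what I mean by subsequent imports, assuming the
importer remembers which (file, revision) pairs it has already handled.
The sqlite3 state file, the table name and both function names are
hypothetical and only serve the example; none of this is existing
monotone code.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open (or create) the importer's state database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS imported "
        "(path TEXT, rev TEXT, PRIMARY KEY (path, rev))"
    )
    return db


def select_new(db, file_revs):
    """Return the (path, rev) pairs not seen before and record them."""
    new = []
    for path, rev in file_revs:
        seen = db.execute(
            "SELECT 1 FROM imported WHERE path = ? AND rev = ?",
            (path, rev),
        ).fetchone()
        if seen is None:
            new.append((path, rev))
            db.execute("INSERT INTO imported VALUES (?, ?)", (path, rev))
    db.commit()
    return new
```

On a second run only the revisions added to the cvs repository since the
first run come back from select_new, so the expensive passes can skip
everything else.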

If I'm heading for such a rewrite, what should I be aware of? Would it
be wise to store results from different processing passes in the normal
monotone db? This would help subsequent imports a lot, of course. On the
other hand you then have 'non-monotone-data' in your database, which you
probably want to delete some day. What could be done differently from
cvs2svn? E.g. you don't absolutely need to sort by date. Overlapping
commits in CVS would be better handled as two heads which later get
merged again in monotone.
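
The idea about overlapping commits could look roughly like this. It is a
simplified model under my own assumptions: each changeset is a
hypothetical (name, start, end) tuple, and link_changesets is an
illustrative name, not anything from monotone or cvs2svn. A changeset
that starts while another is still in progress becomes a sibling head;
the next changeset that starts after both have finished gets both as
parents, i.e. it acts as the merge.

```python
def link_changesets(changesets):
    """Map each changeset name to the list of its parent names."""
    parents = {}
    heads = []  # (name, end) of changesets that have no child yet
    for name, start, end in sorted(changesets, key=lambda c: c[1]):
        # Heads that were complete before this changeset started.
        done = [h for h in heads if h[1] <= start]
        # Heads that were still in progress when it started.
        overlapping = [h for h in heads if h[1] > start]
        if done:
            # Builds on everything finished so far (a merge if several).
            parents[name] = [h[0] for h in done]
        elif overlapping:
            # Becomes a sibling: share the parents of the overlapped head.
            parents[name] = list(parents[overlapping[0][0]])
        else:
            parents[name] = []
        heads = overlapping + [(name, end)]
    return parents
```

For example, with R at (0, 5), A at (10, 20), B at (15, 25) and C at
(30, 35), both A and B get R as parent, and C gets A and B.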

My goals with this are:
 * gain speed in subsequent imports
 * (correct) branch support

What do you think?

Regards

        Markus





