Re: Idea for reducing disk IO on tagging operations

From: Dr. David Alan Gilbert
Subject: Re: Idea for reducing disk IO on tagging operations
Date: Sun, 20 Mar 2005 17:31:22 +0000
User-agent: Mutt/1.5.6+20040907i

[Resend: I sent it with the wrong 'from' address - apologies
if you get both]

* Mark D. Baushke (address@hidden) wrote:

Hi Mark,
  Thanks for your reply.

> Dr. David Alan Gilbert <address@hidden> writes:
> > So - here are my questions/ideas - I'd appreciate comments to tell
> > me whether I'm on the right lines:
> >   1) As I understand it the tag data is the
> >   first of the 3 main data structures in the RCS
> >   file (tag, comments, diffs) and that when I do
> >   pretty much any CVS operation I rewrite the
> >   whole file - is this correct?
> CVS write operations on a foo.c,v repository file
> will write ,foo.c, and then when the write
> operation is successful and without any errors, it
> does a rename (",foo.c,", "foo.c,v"); to make the
> new version the official version. While the
> ,foo.c, file exists, RCS commands will consider
> the file locked.
> It is desirable to use RCS write semantics as many
> other tools out there (cf. ViewCVS) use RCS on the
> repository and want to obey RCS locking.

OK, if I create a dummy ",foo.c," before modifying (or create a hardlink
with that name to foo.c,v?), would that be sufficient?  Or perhaps create
the ,foo.c, as I normally would, but if I can use this overwrite trick
on the original then I just delete the ,foo.c, file.  Is the problem that
other tools are allowed to read the original foo.c,v while you are creating
the new version?
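
For reference, the write-then-rename sequence described above can be sketched roughly as follows. This is only an illustration in Python, not the CVS code: the function name rewrite_rcs_file is hypothetical, and creating the ",foo.c," file with O_EXCL is an assumption about how the lock might be taken.

```python
import os

def rewrite_rcs_file(rcs_path, new_contents):
    """Write new_contents to ',name,' and rename it over 'name,v'.

    Hypothetical helper: the ',name,' file doubles as the lock, so its
    exclusive creation fails if another writer already holds it.
    """
    directory, name = os.path.split(rcs_path)
    base = name[:-2] if name.endswith(",v") else name
    lock_path = os.path.join(directory, "," + base + ",")

    # Assumption: O_EXCL creation is what makes the lock exclusive.
    # Raises FileExistsError if ',name,' already exists.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_contents)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the complete new one.
        os.rename(lock_path, rcs_path)
    except BaseException:
        # Any failure before the rename leaves the original untouched.
        os.unlink(lock_path)
        raise
```

The original foo.c,v stays readable and intact until the final rename, which is the property an in-place overwrite would give up.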

> be configured). So, yes, whitespace is mostly
> irrelevant between sections.


> >   3) So the idea is that when I add a tag I add
> >   a bunch of white space after the tag (let's say
> >   1KB of spaces split into 64 byte lines or
> >   similar); when I come to add the next tag I
> >   check if there is plenty of white space, if
> >   there is then instead of rewriting the file I
> >   just overwrite the white space with my new tag
> >   data; if there is no space then as I rewrite
> >   the file I add another lump of white space.
> This has the potential to more easily corrupt the
> RCS file if the operation is interrupted for any
> reason.

The rewrite that adds the extra space would be performed using the existing
mechanism (with just some extra padding emitted in RCS_rewrite),
so that can't be a problem.
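
As a rough illustration of the padding itself (1KB of spaces split into 64 byte lines), a sketch in Python might look like this. The name make_padding and its defaults are hypothetical; the real change would live in the C code that rewrites the file.

```python
def make_padding(total=1024, line_len=64):
    """Return `total` bytes of whitespace, broken into `line_len`-byte lines.

    Each full line is (line_len - 1) spaces followed by a newline; any
    remainder that does not fill a line is emitted as plain spaces.
    """
    full_lines, rest = divmod(total, line_len)
    line = b" " * (line_len - 1) + b"\n"
    return line * full_lines + b" " * rest
```

Because the block is pure whitespace, it should be invisible to anything that parses the file, while keeping individual lines short.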

So the issue is what happens if the interrupt occurs as I'm overwriting
the white space to add a tag; hmm yes; is it possible to guard against
this by using a single call to write(2) for that?  Is that the problem
you are thinking of?
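
To make the overwrite step concrete, here is a minimal sketch in Python, under several assumptions: the padding sits directly after the "symbols" keyword, it consists only of spaces and newlines, and the keyword appears at the start of a line within the first 64KB. The name add_tag_in_place is hypothetical. The sketch only shows the mechanics of issuing one write call; it does not settle whether a single write(2) is enough to survive an interruption.

```python
import os
import re

HEAD_BYTES = 64 * 1024  # assumption: the tag list starts within this prefix

def add_tag_in_place(path, tag, rev):
    """Overwrite trailing padding after 'symbols' with a new tag entry.

    Returns True if the tag was written in place, or False if there is
    not enough padding and the caller must fall back to a full rewrite.
    """
    entry = b"\n\t" + tag.encode("ascii") + b":" + rev.encode("ascii")
    with open(path, "rb") as f:
        head = f.read(HEAD_BYTES)

    m = re.search(rb"^symbols([ \n]*)", head, re.MULTILINE)
    # Keep one whitespace byte after the entry as a separator.
    if m is None or len(m.group(1)) < len(entry) + 1:
        return False

    offset = m.end(1) - 1 - len(entry)
    fd = os.open(path, os.O_WRONLY)
    try:
        # One positioned write; the file length never changes.
        written = os.pwrite(fd, entry, offset)
        os.fsync(fd)
    finally:
        os.close(fd)
    if written != len(entry):
        raise OSError("short write while adding tag")
    return True
```

Writing at the tail of the padding puts the newest tag immediately before the existing ones and leaves the remaining padding contiguous for the next tag.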

> It would be more robust to enhance CVS to use an
> external database for tagging information instead
> of putting the tagging information into the RCS
> files directly than to rewrite parts of the RCS
> file and hope that the operation didn't corrupt
> the file along the way.

Sure, separating the tagging data out is much neater; but what I was
looking for here was a simple speed-up which didn't require anything
extra and would be fully compatible with existing tools.

> You may wish to consider looking at Meta-CVS as I
> believe that Kaz keeps a lot of the branching
> information outside of the RCS files already.
> See
> for more details on Meta-CVS.

If I were changing to another tool then I'd have a much larger set of
tools to consider (e.g. Subversion), but I'd rather stick with plain CVS
if I can - I've got clients on lots of (weird) OSs that work via pserver
and an infinite number of scripts built around CVS.

Thanks for the reply,

 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    | Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _________________________|_____   |_______/
