Idea for reducing disk IO on tagging operations

From: Dr. David Alan Gilbert
Subject: Idea for reducing disk IO on tagging operations
Date: Sun, 20 Mar 2005 15:30:16 +0000
User-agent: Mutt/1.5.6+20040907i

  I maintain a system that holds a rather large CVS repository
(roughly 1GB), and it could do with being faster.
Tagging in particular is slow, and I don't think CPU or RAM is the
issue (it is a dual Xeon with 3GB of RAM).

My suspicion is that at least one of the problems is that when
a tag is added most of the RCS files are rewritten, giving a sudden
large amount of data that must be written to disc.

So - here are my questions/ideas - I'd appreciate comments to tell
me whether I'm on the right lines:
  1) As I understand it, the tag data is the first of the 3 main
  data structures in the RCS file (tags, comments, diffs), and
  pretty much any CVS operation rewrites the whole file -
  is this correct?

  2) White space appears to be irrelevant in RCS files, so adding
  arbitrary amounts in between sections should leave files still
  fully compatible with existing RCS/CVS tools.

  3) So the idea is that when I add a tag I add a bunch of white
  space after the tag (let's say 1KB of spaces split into 64 byte
  lines or similar). When I come to add the next tag I check whether
  there is enough white space left; if there is, then instead of
  rewriting the file I just overwrite the white space with my
  new tag data; if there is not, then as I rewrite the
  file I add another lump of white space.

  4) Whether dummy white space is added, and how much, is controlled
  by the existing size of the RCS file; so an RCS file that is only
  a few KB won't have any space added. That way this mechanism doesn't
  slow down/bloat small repositories.  The amount of white space might
  be chosen to align data structures with disk block boundaries.

  5) My main concern is to do with concurrency/consistency requirements;
  is the file rewrite essential to ensure consistency, or is the
  locking that is carried out sufficient?

Does this make sense?
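
To make points 3 and 4 concrete, here is a rough Python sketch of
the mechanism. It is only an illustration under assumptions, not
CVS code: the function names, the 64KB size threshold and the
64KB header scan are all made up, the "parser" just looks for the
"symbols" keyword near the top of the file, and it does nothing
about the locking/consistency question in point 5.

```python
import os

HEADER_SCAN = 64 * 1024   # assumption: the symbols list starts in this prefix
WHITESPACE = b" \t\n"


def make_padding(file_size, min_size=64 * 1024, pad_bytes=1024, line_len=64):
    """Point 4: return white space to insert, or nothing for small files."""
    if file_size < min_size:
        return b""
    line = b" " * (line_len - 1) + b"\n"      # one 64 byte line of spaces
    return line * (pad_bytes // line_len)


def add_tag_in_place(path, tag, rev):
    """Point 3: overwrite white space after 'symbols' with a new tag.

    Returns True if the tag was written in place, or False if there
    was not enough white space and the caller must fall back to a
    full rewrite (adding fresh padding as it goes).
    """
    entry = tag.encode() + b":" + rev.encode() + b"\n\t"
    with open(path, "r+b") as f:
        head = f.read(HEADER_SCAN)
        key = head.find(b"\nsymbols")
        if key < 0:
            return False
        start = key + len(b"\nsymbols")
        end = start
        while end < len(head) and head[end] in WHITESPACE:
            end += 1
        if end == len(head):
            return False      # run may continue past the scan; be cautious
        # Keep at least one white space byte between 'symbols' and the tag.
        if end - start < len(entry) + 1:
            return False
        # Write the entry so that it ends exactly where the white space
        # run ended, i.e. directly in front of the existing first tag.
        f.seek(end - len(entry))
        f.write(entry)
        f.flush()
        os.fsync(f.fileno())
    return True
```

The file length never changes on the in-place path, so only the
block(s) holding the tag list need to go to disc. Note that this
overwrite is not atomic in the way a write-then-rename is, which is
exactly the worry in point 5.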


 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    | Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _________________________|_____   |_______/
