Idea for reducing disk IO on tagging operations
From: Dr. David Alan Gilbert
Subject: Idea for reducing disk IO on tagging operations
Date: Sun, 20 Mar 2005 15:30:16 +0000
User-agent: Mutt/1.5.6+20040907i
Hi,
I maintain a system that is used to hold a rather large
CVS repository (~1GB give or take) which could do with being faster.
Tagging in particular is slow, and I don't think CPU or RAM is the
issue (it is a dual Xeon with 3GB of RAM).
My suspicion is that at least one of the problems is that when
a tag is added, most of the RCS files are rewritten, giving a sudden
large amount of data that must be written to disk.
So - here are my questions/ideas - I'd appreciate comments to tell
me whether I'm on the right lines:
1) As I understand it, the tag data is the first of the 3 main
data structures in the RCS file (tags, comments, diffs), and
pretty much any CVS operation rewrites the whole file -
is this correct?
2) White space appears to be irrelevant in RCS files, so adding
arbitrary amounts in between sections should leave files still
fully compatible with existing RCS/CVS tools.
3) So the idea is that when I add a tag I add a bunch of white
space after the tag (let's say 1KB of spaces split into 64 byte
lines or similar); when I come to add the next tag I check if
there is plenty of white space. If there is, then instead of
rewriting the file I just overwrite the white space with my
new tag data; if there is no space, then as I rewrite the
file I add another lump of white space.
4) Whether dummy white space is added, and how much, is controlled
by the existing size of the RCS file; so an RCS file that is only
a few KB won't have any space added. That way this mechanism doesn't
slow down/bloat small repositories. The amount of white space might
be chosen to align data structures with disk block boundaries.
5) My main concern is to do with concurrency/consistency requirements;
is the file rewrite essential to ensure consistency, or is the
locking that is carried out sufficient?
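
To make points 3 and 4 concrete, here is a rough Python sketch of the
idea, not a patch against CVS. Everything named in it (padding_size,
make_padding, add_tag_in_place, the 64KB threshold, the 4096 byte
block size, HEADER_LIMIT) is an assumption made for illustration. It
also assumes the caller already holds the usual CVS lock, that the
first "symbols" keyword in the file is the tag list, and it appends
the new tag at the end of that list, which may not be the order CVS
itself would use.

```python
import os

LINE = b" " * 63 + b"\n"      # one 64-byte line of padding
HEADER_LIMIT = 64 * 1024      # assumed: only scan this much of the file


def padding_size(file_size, symbols_end, min_file=64 * 1024,
                 base_pad=1024, block=4096):
    """Bytes of padding to leave after the tag list (0 for small files)."""
    if file_size < min_file:
        return 0
    pad = base_pad
    overshoot = (symbols_end + pad) % block
    if overshoot:
        pad += block - overshoot   # make the padding end on a block boundary
    return pad


def make_padding(n):
    """Return n bytes of white space, broken into 64-byte lines."""
    full, rest = divmod(n, len(LINE))
    return LINE * full + b" " * rest


def add_tag_in_place(path, name, rev):
    """Add a tag by overwriting padding; False means a full rewrite is needed."""
    entry = b"\n\t" + name + b":" + rev + b";"
    with open(path, "r+b") as f:
        head = f.read(HEADER_LIMIT)
        start = head.find(b"symbols")
        if start < 0:
            return False
        semi = head.find(b";", start)      # terminator of the tag list
        if semi < 0:
            return False
        end = semi + 1
        while end < len(head) and head[end] in b" \t\n":
            end += 1                       # measure the white space run
        if end - (semi + 1) < len(entry):
            return False                   # not enough room, rewrite instead
        f.seek(semi)
        f.write(entry)                     # replaces ';' plus some padding
        f.flush()
        os.fsync(f.fileno())
    return True
```

The file length never changes on the in-place path, so only the blocks
holding the tag list are dirtied. Note that the overwrite is not atomic
in the way a write-to-temp-and-rename is, which is exactly the worry in
point 5.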
Does this make sense?
Dave
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/