Re: Idea for reducing disk IO on tagging operations

From: Dr. David Alan Gilbert
Subject: Re: Idea for reducing disk IO on tagging operations
Date: Sun, 20 Mar 2005 23:54:32 +0000
User-agent: Mutt/1.5.6+20040907i

* Mark D. Baushke (address@hidden) wrote:
> Dr. David Alan Gilbert <address@hidden> writes:
> > 
> > OK, if I create a dummy ",foo.c," before
> > modifying (or create a hardlink with that name
> > to foo.c,v ?) would that be sufficient?
> I would say that it is likely necessary, but may
> not be sufficient.

Hmm ok.

> > Or perhaps create the ,foo.c, as I normally
> > would - but if I can use this overwrite trick on
> > the original then I just delete the ,foo.c,
> > file.
> I am unclear how this lets you perform a speedup.

I only create the ,foo.c, file; I don't write anything into it. The
existence of the file is enough to act as the RCS lock. If I can do my
in-place modification then I delete the file afterwards; if not, I
proceed as normal: write the new contents into ,foo.c, and rename it
over the original, as you normally would.
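
To make the flow above concrete, here is a minimal sketch in Python (not
CVS code). The names rcs_lock_path, tag_with_lock, try_in_place and
rewrite_fallback are hypothetical, and the two callbacks stand in for the
in-place overwrite and the normal full rewrite:

```python
import os

def rcs_lock_path(rcs_path):
    """Map dir/foo.c,v to dir/,foo.c, (the lock name used in this thread)."""
    directory, base = os.path.split(rcs_path)
    if not base.endswith(",v"):
        raise ValueError("not an RCS file name: %r" % base)
    return os.path.join(directory, "," + base[:-2] + ",")

def tag_with_lock(rcs_path, try_in_place, rewrite_fallback):
    """Hypothetical helper: take the lock, try in-place, else rewrite.

    try_in_place(path) -> bool    attempts the in-place overwrite
    rewrite_fallback(path) -> bytes   returns the full new file contents
    """
    lock = rcs_lock_path(rcs_path)
    # O_EXCL makes creation fail if the lock file already exists.
    fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    try:
        if try_in_place(rcs_path):
            # The empty lock file was only a marker; drop it.
            os.close(fd)
            fd = None
            os.unlink(lock)
            return "in-place"
        # Normal path: write the new contents into the lock file,
        # then rename it over the original.
        with os.fdopen(fd, "wb") as f:
            fd = None  # the file object owns the descriptor now
            f.write(rewrite_fallback(rcs_path))
            f.flush()
            os.fsync(f.fileno())
        os.rename(lock, rcs_path)
        return "rewritten"
    except BaseException:
        # Do not leave a stale lock behind on failure.
        if fd is not None:
            os.close(fd)
        if os.path.exists(lock):
            os.unlink(lock)
        raise
```

This only shows the locking order. It says nothing about whether readers
that ignore the lock see a consistent ,v file during the in-place case.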

> > Is the problem that things are allowed to read
> > the original foo.c,v while you are creating the
> > new version?
> I am given to understand that many of the
> ancillary tools that surround CVS make use of
> being able to read a consistent ,v file at all
> times.

This is very tricky. I don't think we use any such tools in our case
(we might have a CVS web front end for browsing, but that is probably
not critical), and as long as I can guarantee that what I do is safe as
far as CVS itself is concerned, I think I'd be prepared to go for it as
a configurable mechanism.

> > So the issue is what happens if the interrupt
> > occurs as I'm overwriting the white space to add
> > a tag; hmm yes; 
> Correct. Depending on the filesystem kind and the
> level of I/O, your rewrite could impact up to three
> fileblocks and the directory data.
> > is it possible to guard against this by using a
> > single call to write(2) for that? 
> Not for all possible filesystem types.
> > Is that the problem you are thinking of?
> Yes. Even worse things can happen in this regard
> if the filesystem is a 'stateless' one such as an
> NFS mounted directory (we keep advising folks
> against using them, but I know for a fact that
> they are still used).

OK, my conscience will let me carefully ignore NFS issues given the
pain it causes me elsewhere (and I will make my mechanism switchable).
What if I only used the overwrite mechanism when none of the
characters being modified crossed a block boundary (e.g. a 512-byte
offset) in the file?  Since the spaces were actually written in a
previous operation, we can assume that the space is already allocated
and no allocation is going to happen at this point (mumble filesystem
journalling mumble!).
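
As a rough sketch of that check (Python, not CVS code): the helper names
within_one_block and overwrite_padding are hypothetical, and BLOCK = 512
is only the example figure from above. Staying inside one block and
using a single write call does not by itself prove the update is atomic
on any given filesystem.

```python
import os

BLOCK = 512  # example block size from the discussion; an assumption

def within_one_block(offset, length, block=BLOCK):
    """True if bytes [offset, offset+length) all fall in one block."""
    if length <= 0:
        return False
    return offset // block == (offset + length - 1) // block

def overwrite_padding(path, offset, data, block=BLOCK):
    """Overwrite pre-written spaces at offset with data, in one write.

    Returns False (file untouched) if the range crosses a block
    boundary or is not entirely spaces, so the caller can fall back
    to the normal rewrite-and-rename path.
    """
    if not within_one_block(offset, len(data), block):
        return False
    fd = os.open(path, os.O_RDWR)
    try:
        # Only replace padding that is already there; a short read
        # near end of file fails this comparison too.
        if os.pread(fd, len(data), offset) != b" " * len(data):
            return False
        written = os.pwrite(fd, data, offset)  # single positioned write
        if written != len(data):
            raise OSError("short write during in-place update")
        os.fsync(fd)
    finally:
        os.close(fd)
    return True
```

When this returns False I would just take the normal path, so the check
only ever decides whether the shortcut is attempted.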

> > Sure, separating the tagging data out is much
> > neater; but what I was looking for here was a
> > simple speed up which didn't require anything
> > extra and would be fully compatible with
> > existing tools.
> And you are finding that existing tools torture
> the assumptions you are able to make about the CVS
> repository.

Nod; it is quite painful!

> FWIW: (In my personal experience) using a SAN
> solution for your repository storage allows you
> much better throughput for all write operations in
> the general case as the SAN can guarantee the
> writes are okay before the disk actually does it.

But when you throw a GB of writes at a SAN in a short time, from a tag
across our whole repository, it isn't going to be happy: it is going
to want to get rid of that backlog of write data ASAP.

> Optimizing for tagging does not seem very useful
> to me as we typically do not drop that many tags
> on our repository.

In the company I work for we are very tag heavy, but more importantly
it is the tagging that gets in people's way and puts the strain on the
write bandwidth of the discs/RAID.

 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    | Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _________________________|_____   |_______/
