[Gnu-arch-users] Re: [PATCH] arch speedups on big trees


From: Miles Bader
Subject: [Gnu-arch-users] Re: [PATCH] arch speedups on big trees
Date: 07 Jan 2004 12:41:58 +0900

Chris Mason <address@hidden> writes:
> > One file per id?!  That seems insane for large trees (the only case where
> > such optimizations are interesting anyway)...
> 
> There are a number of indexed filesystems to choose from that will
> handle one file per id nicely.  There are no operations that will read
> all the files in ++id-mapping, unless you make a changeset that
> changes every file in the repository, or you are linking the rep.

That doesn't help with the space issue.  For instance, on a UFS
filesystem, the fragment size is typically 512 bytes, so a Linux source
tree would have about 8MB of index files; on an ext2 filesystem, where
each file occupies at least one block (typically 4KB), it would be 64MB!
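
As a rough check of those figures, here is a minimal back-of-envelope sketch. The file count (about 16,000) and the 4096-byte ext2 block size are assumptions inferred from the numbers above, not stated in the message, and `index_space_bytes` is a hypothetical helper. It also ignores inode and directory overhead, so real usage would be somewhat higher.

```python
# Estimate the disk space consumed by one tiny index file per id.
# Assumed, not from the original message: ~16,000 files in the tree and a
# 4096-byte ext2 block size. Inode and directory overhead are ignored.

def index_space_bytes(num_files: int, alloc_unit: int, payload: int = 1) -> int:
    """Bytes used when each file is rounded up to whole allocation units."""
    units_per_file = max(-(-payload // alloc_unit), 1)  # ceiling division
    return num_files * units_per_file * alloc_unit


NUM_FILES = 16_000  # assumed size of the source tree

ufs_bytes = index_space_bytes(NUM_FILES, 512)    # UFS fragment size
ext2_bytes = index_space_bytes(NUM_FILES, 4096)  # assumed ext2 block size

print(f"UFS:  {ufs_bytes / 2**20:.1f} MiB")
print(f"ext2: {ext2_bytes / 2**20:.1f} MiB")
```

The point is only that the cost scales with the allocation unit rather than with the few bytes each id file actually holds.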

[`Use a different filesystem' is not a good answer -- I use reiserfs
right now, but using techniques that perform poorly on ext2 is not a
good way to become popular.]

> > I assume this is done _only_ when someone specifies a list of filenames for
> > commit, i.e., whole-tree commits still make inode-sigs, right?
>
> Initially I only skipped the inode sig when --quick was passed, but I
> didn't want to get into situations where the inode sig is somewhat out
> of date (think explicit commit followed by whole-tree commit).  If you've
> got to inventory the FS to check the sig, then you might as well just
> inventory the whole FS and throw the sig out.
...
> The reverse mapping improves the speed at which changesets are applied,
> and this happens without any special work from the user.  Applying
> changesets is a pretty fundamental operation, so making that faster
> improves things across most of arch.

Wait a minute, so you've _entirely disabled_ inode-sigs?  That seems
like a fairly significant lose...

I'm a bit confused -- if you're going to depend on keeping the
reverse-mapping up-to-date, why is that any more reliable than keeping
inode-sigs up-to-date?  Why not just have one big `signature database'
(preferably not `big' in reality of course :-) that includes both inode
and pathname information, and make sure it's always kept as up to date
as possible by all operations?

It just doesn't seem like these things should be at odds with one
another.
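
To make the `signature database' idea concrete, here is a minimal sketch of one possible shape for it, not a description of how arch or the patch actually stores anything. The class, field, and method names are all hypothetical. It keeps one in-memory table holding both the inode signature and the id-to-path reverse mapping, so that a single update keeps the two in step.

```python
import os
from typing import Dict, NamedTuple, Optional, Set


class Sig(NamedTuple):
    """Combined record: inventory id plus inode signature (hypothetical layout)."""
    file_id: str
    inode: int
    mtime_ns: int
    size: int


class SignatureDB:
    """One table serving both the inode-sig check and the reverse mapping."""

    def __init__(self) -> None:
        self.by_path: Dict[str, Sig] = {}
        self.by_id: Dict[str, str] = {}  # reverse mapping: id -> path

    def record(self, path: str, file_id: str) -> None:
        """Record the current stat info for path; both maps change together."""
        old = self.by_path.get(path)
        if old is not None:
            self.by_id.pop(old.file_id, None)
        old_path = self.by_id.get(file_id)
        if old_path is not None and old_path != path:
            self.by_path.pop(old_path, None)  # the id moved, e.g. a rename
        st = os.stat(path)
        self.by_path[path] = Sig(file_id, st.st_ino, st.st_mtime_ns, st.st_size)
        self.by_id[file_id] = path

    def path_for_id(self, file_id: str) -> Optional[str]:
        return self.by_id.get(file_id)

    def modified_paths(self) -> Set[str]:
        """Paths whose stat info no longer matches the recorded signature."""
        changed = set()
        for path, sig in self.by_path.items():
            try:
                st = os.stat(path)
            except FileNotFoundError:
                changed.add(path)
                continue
            if (st.st_ino, st.st_mtime_ns, st.st_size) != sig[1:]:
                changed.add(path)
        return changed
```

A commit or changeset application would call `record` for each file it touches, and a later whole-tree operation would only need to look closely at whatever `modified_paths` reports. How such a table would be stored on disk is left open here.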

> The inode sig could be an indexed file for partial updates.

[Is it really even necessary? -- even reading/writing a big sequential
file probably pales compared to the overhead of doing tons of stats.]

> Changes to make pristine trees hard linkable and replaceable do require
> user work (you have to specify a hook), but that is in the same style as
> hard linkable libraries.  It's a new feature, so the defaults haven't
> really shaken out yet.

BTW, can you expand on the reasons why you want to keep pristine
trees?  They're generally a lot more annoying to manage than revision
libraries, so it'd be nice if they were as unnecessary as possible.  Is
it just locking issues (you can assume you've got sole access to
{arch})?

Thanks,

-Miles
-- 
"1971 pickup truck; will trade for guns"