gnu-arch-users

Re: [Gnu-arch-users] [PATCH] arch speedups on big trees


From: Chris Mason
Subject: Re: [Gnu-arch-users] [PATCH] arch speedups on big trees
Date: Tue, 06 Jan 2004 20:57:48 -0500

On Tue, 2004-01-06 at 17:44, Miles Bader wrote:
> On Tue, Jan 06, 2004 at 04:20:19PM -0500, Chris Mason wrote:
> > 1) Maintain a reverse mapping of ids to the objects that own them.  This
> > lives in project_root/{arch}/++id-mapping, one file per id.
> 
> One file per id?!  That seems insane for large trees (the only case where
> such optimizations are interesting anyway)...
> 

There are a number of indexed filesystems to choose from that will
handle one file per id nicely.  There are no operations that will read
all the files in ++id-mapping, unless you make a changeset that changes
every file in the repository, or you are linking the rep.

My goal for the patch was to demonstrate that reverse mappings work. 
The actual implementation of the reverse mapping could be some kind of
indexed file.  Anything that isn't order N for searches, insertions and
deletions would be fine.
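
For illustration only (this is not the code from the patch), here is a minimal Python sketch of a one-file-per-id reverse mapping. The class name, the "id-" file prefix, the escaping scheme and the file contents are all assumptions; the only detail taken from the description above is the {arch}/++id-mapping directory. Each operation touches a single small file, so the cost depends on the filesystem's directory index rather than on the number of ids in the tree.

```python
import os
from urllib.parse import quote


class IdMapping:
    """Hypothetical reverse map from an inventory id to the path that owns it."""

    def __init__(self, project_root):
        # Directory name taken from the description above; layout inside it is assumed.
        self.dir = os.path.join(project_root, "{arch}", "++id-mapping")
        os.makedirs(self.dir, exist_ok=True)

    def _file_for(self, file_id):
        # Escape the id so it is always a single safe file name (assumed scheme).
        return os.path.join(self.dir, "id-" + quote(file_id, safe=""))

    def set(self, file_id, path):
        """Record (or overwrite, e.g. after a rename) the path owning file_id."""
        with open(self._file_for(file_id), "w") as f:
            f.write(path + "\n")

    def lookup(self, file_id):
        """Return the path owning file_id, or None if the id is unknown."""
        try:
            with open(self._file_for(file_id)) as f:
                return f.read().rstrip("\n")
        except FileNotFoundError:
            return None

    def remove(self, file_id):
        """Forget file_id; removing an unknown id is not an error."""
        try:
            os.remove(self._file_for(file_id))
        except FileNotFoundError:
            pass
```

Swapping the directory for a single indexed file would only change the bodies of these four methods, which is the sense in which the storage format is a detail.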

> > 3) Avoid inode signatures for everything except library revisions. 
> > Since taking an inode signature involves a whole tree inventory, we
> > should only take them when we know we're going to read them at least
> > twice before snapping them again.  Otherwise, the inode sig is a net
> > loss in speed.
> 
> I assume this is done _only_ when someone specifies a list of filenames for
> commit, i.e., whole-tree commits still make inode-sigs, right?
> 
Initially I only skipped the inode sig when --quick was passed, but I
didn't want to get into situations where the inode sig is somewhat out
of date (think explicit commit followed by whole-tree commit).  If you've
got to inventory the FS to check the sig, then you might as well just
inventory the whole FS and throw the sig out.

Maybe I'm missing something about the signatures, though.

> Actually even in the restricted-commit case, it seems like you could update
> inode-sigs by simply reading an old inode-sigs file (if there is one), and
> updating it to include the committed files.
> 

The inode sig could be an indexed file for partial updates.
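
A rough Python sketch of what a partial update could look like. This is an assumption-laden illustration, not tla's real signature format: the signature is modelled as a dict from tree-relative path to a tuple of stat fields, and the function names are made up. Only the entries for the committed files are re-read; every other entry is carried over untouched.

```python
import os


def stat_fields(full_path):
    """Fields assumed to make up one signature entry (hypothetical choice)."""
    st = os.lstat(full_path)
    return (st.st_dev, st.st_ino, st.st_mtime_ns, st.st_size)


def update_sig(sig, root, committed_paths):
    """Return a copy of sig with only the committed files refreshed."""
    new_sig = dict(sig)
    for rel in committed_paths:
        try:
            new_sig[rel] = stat_fields(os.path.join(root, rel))
        except FileNotFoundError:
            # The commit deleted this file, so drop its entry.
            new_sig.pop(rel, None)
    return new_sig


def possibly_modified(sig, root, rel):
    """True if rel has no entry or its stat fields differ from the stored ones."""
    try:
        return sig.get(rel) != stat_fields(os.path.join(root, rel))
    except FileNotFoundError:
        return True
```

The cost of update_sig is proportional to the number of committed files, not to the size of the tree, which is the point of the partial update.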

> [I've got to say I'm a bit worried by the vague impression I'm getting from
> this discussion: it seems like adding lots of little grotty hacks to speed up
> specific special cases -- with the onus on the _user_ to give the magic
> incantations required to hit the sweet spot -- and
> ignoring/removing/screwing-with more general optimizations.
> 

If there are suggestions for more general optimizations, I'm all ears ;)

The reverse mapping improves the speed at which changesets are applied,
and this happens without any special work from the user.  Applying
changesets is a pretty fundamental operation, so making that faster
improves things across most of arch.

Changes to make pristine trees hard linkable and replaceable do require
user work (you have to specify a hook), but that is in the same style as
hard linkable libraries.  It's a new feature, so the defaults haven't
really shaken out yet.

Changes to send an explicit file list to tla commit are aimed directly
at scripts.  Without the explicit list, I got the commit time down to
20-30 seconds on my linux kernel tree (16 without the tree-lint), which
is pretty much fast enough when you're just doing one commit.  Even
unpatched arch only took 40-60 seconds most of the time.

But when you've got a script that commits 100 or 1000 patches, you need
a subsecond commit time; at 20-30 seconds each, 1000 commits would take
roughly 5.5 to 8 hours.  So I'm making the script do some of the work,
but it's the non-general case.  I've got example scripts that commit
patches, if people are interested.  They don't deal with renames, but
they do make use of tla commit --file-list.
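
To give an idea of the work the script takes on, here is a hedged Python sketch (not one of the example scripts mentioned above) that pulls the list of touched files out of a unified diff. The function name and the strip parameter are assumptions modelled on patch -p1; like the scripts above, it does not handle renames. How the resulting list is handed to tla commit --file-list depends on the patch, so that step is deliberately left out.

```python
def files_touched(patch_text, strip=1):
    """Return the files a unified diff touches, in order, without duplicates.

    strip is the number of leading path components to drop, like patch -pN.
    """
    def name_of(header_line):
        # Drop the "--- " / "+++ " marker and any trailing timestamp.
        return header_line[4:].split("\t", 1)[0].strip()

    def stripped(name):
        parts = name.split("/", strip)
        return parts[-1] if len(parts) > strip else name

    files = []
    old = None
    for line in patch_text.splitlines():
        if line.startswith("--- "):
            old = name_of(line)
            continue
        if line.startswith("+++ ") and old is not None:
            new = name_of(line)
            # A deleted file has /dev/null on the "+++" side, so use the old name.
            chosen = stripped(old if new == "/dev/null" else new)
            if chosen not in files:
                files.append(chosen)
        # Only a "+++" line directly after a "---" line counts as a file header.
        old = None
    return files
```

A script would call this once per patch, apply the patch, and then commit only the returned files instead of letting commit inventory the whole tree.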

-chris





