
Re: [Gnu-arch-users] [PATCH] arch speedups on big trees


From: Tom Lord
Subject: Re: [Gnu-arch-users] [PATCH] arch speedups on big trees
Date: Fri, 19 Dec 2003 12:56:06 -0800 (PST)

    > From: Andrew Suffield <address@hidden>

Thanks and "here, try this":

    > On a smallish tree with 740 files, here's the top of strace -c when I
    > apply about 20 changesets in one run:

    > % time     seconds  usecs/call     calls    errors syscall
    > ------ ----------- ----------- --------- --------- ----------------
    >  33.87    0.505745           6     79916           write
    >  15.30    0.228506           6     40300      3239 lstat64
    >  15.04    0.224609           7     33808        18 open
    >  10.63    0.158737           3     55474           read
    >   6.12    0.091420           3     33837           close

    > (Total runtime, about 3 seconds; percentages are of the total syscall
    > time, not the total runtime)

(Reordering things:)

    > The writes are mostly writing inode-sigs files (why is it writing so
    > many? looks to be one-per-changeset-applied).

Oh dear.  Well, that's probably utterly trivial to mostly fix.

  ./src/tla/libarch/inode-sig.c(arch_create_inode_sig_file)

needs a call to ./src/hackerlab/vu/safe-vfdbuf.c(safe_buffer_fd).
Care to try that out?  (bufsize and buffer_flags can safely be passed
0.  flags is just O_WRONLY).  (There might be a couple of other cases
here and there where buffering ought to be turned on.)
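
To illustrate why buffering helps here, a minimal sketch of write
coalescing in plain C.  This is not hackerlab's vfdbuf code and it does
not call safe_buffer_fd; the names counted_buf, cb_init, cb_write,
cb_flush and emit_entry are hypothetical, and the flush only counts
what would have been a write(2) call instead of performing one.

```c
#include <stddef.h>
#include <string.h>

#define CB_SIZE 4096  /* the default buffer size mentioned in this thread */

/* Hypothetical coalescing buffer; NOT hackerlab's vfdbuf. */
struct counted_buf {
    char data[CB_SIZE];
    size_t used;      /* bytes currently held in data */
    long pieces;      /* cb_write calls: what unbuffered output would cost */
    long flushes;     /* buffer drains: what buffered output costs */
    long bytes_out;   /* total bytes handed to the simulated sink */
};

void cb_init(struct counted_buf *b)
{
    b->used = 0;
    b->pieces = 0;
    b->flushes = 0;
    b->bytes_out = 0;
}

/* Drain the buffer.  A real implementation would call write(2) here. */
void cb_flush(struct counted_buf *b)
{
    if (b->used == 0)
        return;
    b->flushes++;
    b->bytes_out += (long)b->used;
    b->used = 0;
}

/* Append len bytes, draining whenever the buffer fills up. */
void cb_write(struct counted_buf *b, const char *src, size_t len)
{
    b->pieces++;
    while (len > 0) {
        size_t room = CB_SIZE - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, src, n);
        b->used += n;
        src += n;
        len -= n;
        if (b->used == CB_SIZE)
            cb_flush(b);
    }
}

/* One entry emitted as five pieces: two pairs of strings plus a newline. */
void emit_entry(struct counted_buf *b, const char *s1, const char *s2,
                const char *s3, const char *s4)
{
    cb_write(b, s1, strlen(s1));
    cb_write(b, s2, strlen(s2));
    cb_write(b, s3, strlen(s3));
    cb_write(b, s4, strlen(s4));
    cb_write(b, "\n", 1);
}
```

Calling emit_entry 740 times and then cb_flush once leaves pieces at
3700 (five per entry), while flushes stays at roughly the total byte
count divided by CB_SIZE.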

The actual math is:

  20 changesets implies creation of 20 inode signature files

  sans buffering, each file in the tree means calling write(2) 5
  times while making an inode signature file (printfmt "%s%s" twice
  plus printfmt "\n")

  20 * 740 * 5 = 74000

leaving about 5.9K writes (~ 7.4% of those you saw) unaccounted for.

Those 74000 calls to write will hopefully be replaced by something in
the neighborhood of 600 if you turn on buffering using the default
buffer size.  (Closer to 20 if you instead use a big buffer size --
I'm guessing that about a 128K buffer would bottom out (the default
being 4K).)
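
The estimates above can be reproduced with a small C sketch.  The
function names are hypothetical, and the figure of 160 bytes per
signature entry used in the note below is an assumption, not something
measured in this thread; it is chosen only because it is consistent
with the "neighborhood of 600" guess.  Real entry sizes depend on path
and id lengths.

```c
/* Unbuffered: every small piece of output is its own write(2) call. */
long unbuffered_writes(long changesets, long files, long writes_per_file)
{
    return changesets * files * writes_per_file;
}

/*
 * Buffered: each inode signature file costs one write(2) per full
 * buffer, plus one for any remainder.  bytes_per_entry is an assumed
 * average, not a measured value.
 */
long buffered_writes(long changesets, long files,
                     long bytes_per_entry, long bufsize)
{
    long bytes_per_sig_file = files * bytes_per_entry;
    long writes_per_sig_file = (bytes_per_sig_file + bufsize - 1) / bufsize;
    return changesets * writes_per_sig_file;
}
```

With 20 changesets and 740 files, unbuffered_writes(20, 740, 5) gives
74000.  Under the 160-byte assumption, buffered_writes(20, 740, 160,
4096) gives 580 and buffered_writes(20, 740, 160, 131072) gives 20,
i.e. one write per signature file.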

As for "why one inode signature file per changeset applied?": strictly
speaking, you're right that that isn't logically necessary -- but it
sure is simpler.   Just turn on buffering.


    > The lstat64s are more or less entirely inode-sigs checks, 

Those are harder to kill.  Mason's hack and the proposed --quick flag
are both aimed at killing those, along with the directory reading.


    > and the open/read/closes are mostly reading explicit tag files.

This would, judging by the number of `open' errors, be a mostly if
not entirely explicitly tagged tree?  Anyway ...

Those are not quite trivial to get rid of in the way previously
planned but not that hard either (given an inode-sig cache).  That is
on the 1.2 list.

(Mason's hack would _also_ get rid of most of those in some cases but
is redundant in that regard.)



    > All of these things are happening too damn often, 

Summing up: 

Please try adding that safe_buffer_fd call and see if that doesn't
knock out a big chunk of those calls to `write', and thus approaching
~1/3 of the time.

Extending the inode-signature-implies-inventory-id hack to cover
explicit tags will knock out a great deal of the open/read/close
triples.  
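
A rough sketch of the idea, in C: if the lstat result for a file still
matches a cached signature, reuse the cached inventory id and skip the
open/read/close of the explicit tag file.  Everything here is
hypothetical: the struct layout, the choice of signature fields (dev,
ino, size, mtime), and the function names are assumptions for
illustration, not tla's actual inode-sig format or code.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>
#include <sys/types.h>
#include <sys/stat.h>

#define INV_ID_MAX 256

/* Hypothetical cache entry; the field choice is an assumption. */
struct sig_cache_entry {
    dev_t dev;
    ino_t ino;
    off_t size;
    time_t mtime;
    char inventory_id[INV_ID_MAX];
};

/* Remember the signature of a file together with its inventory id. */
void sig_cache_store(struct sig_cache_entry *e, const struct stat *st,
                     const char *id)
{
    e->dev = st->st_dev;
    e->ino = st->st_ino;
    e->size = st->st_size;
    e->mtime = st->st_mtime;
    strncpy(e->inventory_id, id, INV_ID_MAX - 1);
    e->inventory_id[INV_ID_MAX - 1] = '\0';
}

/* Nonzero when the lstat result still matches the cached signature. */
int sig_matches(const struct sig_cache_entry *e, const struct stat *st)
{
    return e->dev == st->st_dev && e->ino == st->st_ino
        && e->size == st->st_size && e->mtime == st->st_mtime;
}

/*
 * Return the cached id when the signature matches.  NULL means the
 * caller has to fall back to reading the tag file.
 */
const char *cached_inventory_id(const struct sig_cache_entry *e,
                                const struct stat *st)
{
    return sig_matches(e, st) ? e->inventory_id : NULL;
}
```

The caller would still pay for the lstat, but only a file whose
signature has changed costs the open/read/close triple.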

Those two will shave off approaching ~2/3 of your time.

mason's hack can probably bump that up to approaching ~5/6.
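
Those fractions come from adding up the strace -c percentages quoted
earlier, which can be sketched in C as below.  The sums are upper
bounds: they assume each category of call goes away entirely, and
they are shares of syscall time, not of total runtime.  The variable
and function names are hypothetical, and attributing all of the
lstat64 time to mason's hack is a simplifying assumption.

```c
/* Percent of total syscall time, from the strace -c table above. */
const double pct_write   = 33.87;
const double pct_lstat64 = 15.30;
const double pct_open    = 15.04;
const double pct_read    = 10.63;
const double pct_close   =  6.12;

/* Buffering the inode signature output targets the writes. */
double share_after_buffering(void)
{
    return pct_write;
}

/* Covering explicit tags also targets the open/read/close triples. */
double share_after_tag_cache(void)
{
    return pct_write + pct_open + pct_read + pct_close;
}

/* Assumption: mason's hack additionally removes the lstat64 calls. */
double share_after_masons_hack(void)
{
    return share_after_tag_cache() + pct_lstat64;
}
```

The three functions return about 33.9, 65.7 and 81.0 percent, which is
where "approaching" ~1/3, ~2/3 and ~5/6 come from.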

Combined with the resulting kernel-cache efficiencies -- your
intuition that (another) order of magnitude savings is quite
achievable seems right to me too.

How many orders of magnitude will we have come from the early days of
larch?  Somewhere in the 3-5 range, I think, though I haven't kept
careful track.  :-)


    > and even on a smallish tree they're taking up a fair amount of
    > time. On a really big tree, like linux, they're going to
    > dominate. For long series of changeset application, it's going
    > to be *horrible*.

That's never been in dispute and even _after_ these improvements will
_still_ be an issue -- which is why so much of what gets optimized is
how to eliminate "long series of changeset application" for work-a-day
developers.

-t




