From: graydon hoare
Subject: [Monotone-devel] scalability
Date: 04 Nov 2003 10:52:36 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
hi,
I've been playing with scalability and performance a bit more
recently, and have achieved a couple happy results: first, having
isolated the speed issue in cvs importing, I can pull in the gcc
repository without bringing my machine to its knees:
$ time ../monotone --db=import.db cvs_import ~/src/gcc-cvs/gcc-cvs/gcc/
monotone: [file branches: 65964] [tree branches: 9599] [versions: 306816]
symlink-tree,v
monotone: phase 1 (version import) complete
monotone: [branches: 9597] [edges: 115264] [file branches: 65965] [tree branches: 9599] [versions: 306820]
monotone: phase 2 (ancestry reconstruction) complete
real 328m6.030s
user 282m7.100s
sys 4m54.270s
$ ../monotone --db=import.db db info
schema version : f042f3c4d0a4f98f6658cbaf603d376acf88ff4b
full manifests : 1
manifest deltas : 98543
full files : 20385
file deltas : 191994
$ find /media/src/gcc-cvs/gcc-cvs/gcc -type f | wc -l
23248
so.. about 5.5 hours to import ~98k tree versions, or a sustained 5
tree states / second, with ~20000 manifest entries in each tree
state. still feels a bit slow when it's happening, but it can be done
in a workday now. flat profiles show RSA and SHA1 functions holding
most of the top slots now, so there's not much else I can do
speed-wise. perhaps another minor jump if I finish porting the p4
multiply8 implementation from msvc.
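
as a quick sanity check on the throughput figure, here is a small
python sketch that redoes the arithmetic from the transcript above. it
assumes the number of tree versions equals the number of stored
manifests (full manifests + manifest deltas from "db info"); that
mapping is an assumption made for illustration, not something the
import output states directly.

```python
# Figures copied from the "time" and "db info" output above.
real_seconds = 328 * 60 + 6.030   # real 328m6.030s
manifest_deltas = 98543
full_manifests = 1

# Assumption: one stored manifest per imported tree version.
tree_versions = manifest_deltas + full_manifests

hours = real_seconds / 3600
rate = tree_versions / real_seconds

print(f"{hours:.2f} hours, {rate:.2f} tree states/second")
```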
so, that's cool, I'm immensely happy. but what's *more* interesting is
that I played with some sqlite pager parameters a bit, and got this
result:
$ ls -lh import.db
-rw-r--r-- 1 graydon graydon 648M Nov 4 06:54 import.db
$ du -skh /media/src/gcc-cvs/gcc-cvs/gcc
1.2G /media/src/gcc-cvs/gcc-cvs/gcc
yup, by tweaking the pager parameters the database can be made to
occupy about *half* the space of the corresponding CVS repo (probably
just because the head versions are gzipped). this rule appears to
apply to any moderately large repo; libjava alone has the same
characteristic. it's surprising because I used to be following the CVS
size almost exactly, so we were being *very* wasteful with our sqlite
pages, probably overflowing a lot of page cells.
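
for anyone who wants to see the page-size effect on their own data,
here is a rough sketch using python's sqlite3 module (SQLite 3), where
the page size is a runtime pragma. this is not necessarily how the
setting is changed in the sqlite that monotone uses, and the table
layout and payload sizes below are made up for illustration. it builds
the same content at two page sizes and prints the resulting file sizes;
the ratio you get will depend heavily on the shape of your data.

```python
import os
import sqlite3
import tempfile

def build_db(path, page_size, blobs):
    """Create a fresh database with the given page size and store the blobs."""
    conn = sqlite3.connect(path)
    # page_size only takes effect if set before the first table is created.
    conn.execute(f"PRAGMA page_size = {page_size}")
    conn.execute("CREATE TABLE files (id TEXT PRIMARY KEY, data BLOB)")
    conn.executemany("INSERT INTO files VALUES (?, ?)", blobs)
    conn.commit()
    actual = conn.execute("PRAGMA page_size").fetchone()[0]
    conn.close()
    return actual, os.path.getsize(path)

# Hypothetical payload: 2000 rows of ~3 KB each, standing in for stored file texts.
blobs = [(f"{i:040x}", os.urandom(3000)) for i in range(2000)]

results = {}
with tempfile.TemporaryDirectory() as tmp:
    for page_size in (1024, 8192):
        path = os.path.join(tmp, f"test-{page_size}.db")
        results[page_size] = build_db(path, page_size, blobs)

for page_size, (actual, size) in results.items():
    print(f"page_size={actual}: {size / 1024:.0f} KiB on disk")
```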
unfortunately I can't reuse the schema migration stuff to move from
one page size to another; either you got it or you don't. sqlite
refuses to even open a db with a different page size. given the
enormous space savings I'm tempted to commit this new setting, but
I'll need to implement the ascii "db dump" & "db load" commands, to
handle existing DBs, and document the change.
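
the dump & load idea can be sketched roughly like this, again with
python's sqlite3 module rather than monotone's own code. the
dump_and_load helper, the "files" table and the file names are all
hypothetical; the actual "db dump" / "db load" commands don't exist
yet and may look quite different. the point is only that a textual SQL
dump carries no page-size information, so it can be replayed into a
fresh database created with whatever page size you like.

```python
import os
import sqlite3
import tempfile

def dump_and_load(src_path, dst_path, page_size):
    """Copy src into a new database at dst_path that uses page_size."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # Must happen before anything is written to the new file.
        dst.execute(f"PRAGMA page_size = {page_size}")
        # iterdump() yields the whole database as SQL text statements.
        dst.executescript("\n".join(src.iterdump()))
    finally:
        src.close()
        dst.close()

with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.db")
    new_path = os.path.join(tmp, "new.db")

    # Build a small "existing" database with a 1024-byte page size.
    old = sqlite3.connect(old_path)
    old.execute("PRAGMA page_size = 1024")
    old.execute("CREATE TABLE files (id TEXT PRIMARY KEY, data TEXT)")
    old.executemany("INSERT INTO files VALUES (?, ?)",
                    [("a", "hello"), ("b", "world")])
    old.commit()
    old.close()

    dump_and_load(old_path, new_path, 8192)

    # Check that the contents survived and the new page size took effect.
    new = sqlite3.connect(new_path)
    new_page_size = new.execute("PRAGMA page_size").fetchone()[0]
    rows = new.execute("SELECT id, data FROM files ORDER BY id").fetchall()
    new.close()

print(new_page_size, rows)
```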
any objection?
-graydon