From: Eric Anderson
Subject: [Monotone-devel] [patch] Memory improvements for commit and during synchronization
Date: Mon, 13 Jun 2005 01:50:45 -0700

All, 
        I've now spent a while looking into memory usage.  Based on a
test case of a 100 MB random file, plus everything that gets added when
I do an add after compiling monotone, I can report a 1.11x CPU
reduction on import and a 1.43x reduction in memory usage (assuming
minflt is an accurate measure of peak memory allocated).
        For server mode, I have a 7.05x CPU reduction and a 1.15x
reduction in memory usage.  For client mode, I have a 2.265x CPU
reduction and a 1.70x reduction in memory usage.
        I've attached detailed results on time use, data copies, and
allocations.  The build passes all unit and integration tests.

Major changes: 
  1) use zlib directly for compression, to get more control over allocation
     [ slightly better compression and faster on compressible data,
       much slower on random data ] (sketched below)
  2) a new base64 encoder, to get more control over memory allocation
     [ quite a bit faster, and could be optimized further ] (sketched below)
  3) elimination of unnecessary copies in most cases, and earlier freeing
     of data that won't be used later (idiom sketched below)
  4) use of the string_queue I provided a patch for earlier
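
Roughly, (1) amounts to something like the sketch below: drive zlib's
deflate API directly and size the output once with deflateBound(), so
the compressor never reallocates behind your back.  This is
illustrative only, not the actual patch code; deflate_whole is a
made-up name, and it assumes the input fits in a uInt.

    #include <zlib.h>
    #include <string>
    #include <stdexcept>

    // Single-shot deflate of a whole buffer into a caller-managed
    // output string, allocated exactly once.
    std::string deflate_whole(std::string const & in)
    {
      z_stream z;
      z.zalloc = Z_NULL;
      z.zfree = Z_NULL;
      z.opaque = Z_NULL;
      if (deflateInit(&z, Z_DEFAULT_COMPRESSION) != Z_OK)
        throw std::runtime_error("deflateInit failed");

      std::string out;
      out.resize(deflateBound(&z, in.size()));   // worst case, one allocation

      z.next_in = reinterpret_cast<Bytef *>(const_cast<char *>(in.data()));
      z.avail_in = in.size();
      z.next_out = reinterpret_cast<Bytef *>(&out[0]);
      z.avail_out = out.size();

      int r = deflate(&z, Z_FINISH);             // all input in one call
      deflateEnd(&z);
      if (r != Z_STREAM_END)
        throw std::runtime_error("deflate failed");

      out.resize(out.size() - z.avail_out);      // trim to compressed size
      return out;
    }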
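
Likewise for (2), the shape of the idea is to compute the exact base64
output length up front and encode straight into it, so the whole
encoding costs one allocation.  Again a sketch, not the patch:

    #include <string>

    static char const b64_tab[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
      "0123456789+/";

    std::string base64_encode(std::string const & in)
    {
      std::string out;
      out.resize(((in.size() + 2) / 3) * 4);  // exact final size
      std::string::size_type o = 0, i = 0;
      for (; i + 3 <= in.size(); i += 3)      // whole 3-byte groups
        {
          unsigned v = (static_cast<unsigned char>(in[i]) << 16)
                     | (static_cast<unsigned char>(in[i+1]) << 8)
                     |  static_cast<unsigned char>(in[i+2]);
          out[o++] = b64_tab[(v >> 18) & 63];
          out[o++] = b64_tab[(v >> 12) & 63];
          out[o++] = b64_tab[(v >> 6) & 63];
          out[o++] = b64_tab[v & 63];
        }
      if (i < in.size())                      // 1- or 2-byte tail
        {
          unsigned v = static_cast<unsigned char>(in[i]) << 16;
          bool two = (i + 1 < in.size());
          if (two)
            v |= static_cast<unsigned char>(in[i+1]) << 8;
          out[o++] = b64_tab[(v >> 18) & 63];
          out[o++] = b64_tab[(v >> 12) & 63];
          out[o++] = two ? b64_tab[(v >> 6) & 63] : '=';
          out[o++] = '=';
        }
      return out;
    }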
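
And the copy-elimination in (3) mostly comes down to this pre-C++11
idiom (no std::move in 2005): swap a large buffer into its consumer
instead of assigning it, so the dead copy can be freed right away.
A sketch, with made-up names:

    #include <algorithm>
    #include <string>

    void consume(std::string & sink, std::string & big)
    {
      std::swap(sink, big);  // O(1) handoff, no byte copy
      // `big` now holds the old sink contents; destroying or clearing
      // it frees the memory earlier instead of keeping a copy alive.
    }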

Merging with the sqlite3 branch should yield another chunk of savings,
but when I tried the merge, the result didn't pass the integration
tests, probably because I got the merge wrong.

I've attached a patch against
7b85d52f5c777c7cdbd2f5b17f284d9b87f6936a; I'm at
7fc5dc7dfcdfdb8bf0d52bac8d5db31b45b00f36 after the patch.  The patch
is still messier than I would like, but I wanted to give people a
chance to look at it now.  It is missing the changelog updates and the
configure.ac change to add -lz to the linked libraries.  I can give
someone a public key if they would prefer that I push the changes
somewhere.

There is more memory usage that could be reduced with further work,
but the fact that monotone pushes file data through a pipeline of
data -> gzip -> base64 -> intosql means that even if you are careful,
there are effectively two copies of any one file in memory at any
time.  It may be worth considering some sort of chunking approach for
large files; a rough sketch follows.  What is the target size for the
largest files monotone should handle?
        -Eric

Attachment: memory-usage.patch
Description: memory-usage patch

Attachment: detailed-results.txt
Description: Binary data

