canonicalize_filename_mode memory usage
From: Sergey Poznyakoff
Subject: canonicalize_filename_mode memory usage
Date: Thu, 17 Dec 2009 10:24:56 +0200
Hello,
A user of tar reported suboptimal memory usage by
canonicalize_filename_mode. The patch he proposed is
attached. Any comments?
Regards,
Sergey
This patch corrects highly suboptimal memory allocation in
canonicalize_filename_mode(), which was exposed by the following change:
2009-08-07 Sergey Poznyakoff <address@hidden>
[...]
* src/misc.c: Include canonicalize.h
(zap_slashes, normalize_filename): New functions.
On a specific test case (a tree with around 3500 sub-directories with a
total of around 58,000 files), this reduces tar's memory usage from
32 MB to 4.5 MB for the initial full backup run ("-g" enabled), and from
19 MB to 5.5 MB for a subsequent incremental run.
On a real-world system with around 370,000 directories and 2.3 million
files where the problem was first spotted, an incremental run of a
32-bit build of tar 1.22.90 bumped into the 3 GB process address space
limit and failed, whereas a build with this patch applied uses around
400 MB during incremental runs and around 300 MB during initial full
backup runs.
--- tar-1.22.90/gnu/canonicalize.c.orig 2009-08-07 11:55:47 +0000
+++ tar-1.22.90/gnu/canonicalize.c 2009-12-08 15:50:40 +0000
@@ -161,6 +161,7 @@ canonicalize_filename_mode (const char *
char const *end;
char const *rname_limit;
size_t extra_len = 0;
+ size_t actual_size;
Hash_table *ht = NULL;
if (name == NULL)
@@ -325,6 +326,10 @@ canonicalize_filename_mode (const char *
--dest;
*dest = '\0';
+ actual_size = strlen(rname) + 1;
+ if (rname_limit - rname > actual_size)
+ rname = xrealloc (rname, actual_size);
+
free (extra_buf);
if (ht)
hash_free (ht);
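
For readers without the gnulib sources at hand, the idea behind the patch can be sketched in isolation: build the result in a generously sized work buffer, then give the unused tail back to the allocator before returning. This is only an illustration, not the gnulib code. The names shrink_to_fit and copy_via_big_buffer are hypothetical, and plain realloc stands in for the xrealloc used in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of gnulib: trim a NUL-terminated heap
   buffer of CAPACITY bytes down to the string it actually holds.  */
char *
shrink_to_fit (char *buf, size_t capacity)
{
  size_t needed;

  assert (buf != NULL);
  needed = strlen (buf) + 1;    /* the string plus its terminating NUL */

  if (needed < capacity)
    {
      char *p = realloc (buf, needed);
      if (p != NULL)
        return p;
      /* A failed shrink leaves BUF valid, so keep using it.  */
    }
  return buf;
}

/* Hypothetical stand-in for a function that assembles its result in a
   large work buffer, as a path canonicalizer might.  Returns NULL if S
   does not fit in CAPACITY bytes or if allocation fails.  */
char *
copy_via_big_buffer (char const *s, size_t capacity)
{
  char *buf;

  if (strlen (s) >= capacity)
    return NULL;

  buf = malloc (capacity);
  if (buf == NULL)
    return NULL;

  strcpy (buf, s);
  return shrink_to_fit (buf, capacity);
}
```

A call such as copy_via_big_buffer ("/usr/lib", 4096) returns a block sized for the 9 bytes actually needed rather than the full work buffer; the 4096 here is an arbitrary illustrative capacity. When a caller keeps one such string per directory for the whole run, the unused tails add up, which is consistent with the savings reported above.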