bug-gnulib

canonicalize_filename_mode memory usage


From: Sergey Poznyakoff
Subject: canonicalize_filename_mode memory usage
Date: Thu, 17 Dec 2009 10:24:56 +0200

Hello,

A tar user reported suboptimal memory usage in
canonicalize_filename_mode.  The patch he proposed is
attached.  Any comments?

Regards,
Sergey

This patch corrects highly non-optimal memory allocation by
canonicalize_filename_mode(), which was exposed by this change:

2009-08-07  Sergey Poznyakoff  <address@hidden>
[...]
        * src/misc.c: Include canonicalize.h
        (zap_slashes, normalize_filename): New functions.

On a specific test case (a tree with around 3500 sub-directories with a
total of around 58,000 files), this reduces tar's memory usage from
32 MB to 4.5 MB for the initial full backup run ("-g" enabled), and from
19 MB to 5.5 MB for a subsequent incremental run.

On a real-world system with around 370,000 directories and 2.3 million
files where the problem was first spotted, an incremental run of a
32-bit build of tar 1.22.90 bumped into the 3 GB process address space
limit and failed, whereas a build with this patch applied uses around
400 MB during incremental runs and around 300 MB during initial full
backup runs.

--- tar-1.22.90/gnu/canonicalize.c.orig 2009-08-07 11:55:47 +0000
+++ tar-1.22.90/gnu/canonicalize.c      2009-12-08 15:50:40 +0000
@@ -161,6 +161,7 @@ canonicalize_filename_mode (const char *
   char const *end;
   char const *rname_limit;
   size_t extra_len = 0;
+  size_t actual_size;
   Hash_table *ht = NULL;
 
   if (name == NULL)
@@ -325,6 +326,10 @@ canonicalize_filename_mode (const char *
     --dest;
   *dest = '\0';
 
+  actual_size = strlen(rname) + 1;
+  if (rname_limit - rname > actual_size)
+    rname = xrealloc (rname, actual_size);
+
   free (extra_buf);
   if (ht)
     hash_free (ht);
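
The idea behind the added lines can be shown in isolation.  What follows
is a minimal sketch, not the gnulib code: the function copy_and_shrink
and the constant BUF_SIZE are hypothetical names chosen for illustration,
and plain malloc/realloc stand in for xmalloc/xrealloc.  It fills an
oversized heap buffer with a short string and then gives the unused tail
back to the allocator before returning the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size of the oversized initial buffer (an assumption
   for this sketch, not a value taken from the patch).  */
#define BUF_SIZE 4096

/* Copy NAME into a BUF_SIZE-byte heap buffer, then shrink the buffer
   to the string's actual size.  Return NULL if NAME does not fit or
   if the initial allocation fails.  The caller frees the result.  */
static char *
copy_and_shrink (const char *name)
{
  size_t len;
  size_t actual_size;
  char *buf;
  char *shrunk;

  assert (name != NULL);

  len = strlen (name);
  if (len >= BUF_SIZE)
    return NULL;

  buf = malloc (BUF_SIZE);
  if (buf == NULL)
    return NULL;
  memcpy (buf, name, len + 1);  /* Includes the terminating NUL.  */

  /* Same test as in the patch: shrink only when space is wasted.  */
  actual_size = strlen (buf) + 1;
  if (BUF_SIZE > actual_size)
    {
      shrunk = realloc (buf, actual_size);
      /* If shrinking fails, the original block is still valid.  */
      if (shrunk != NULL)
        buf = shrunk;
    }

  return buf;
}
```

When many such results are kept alive at once, as with one name per
directory, the unused tail of every buffer adds up, so trimming each
one before it is returned is what keeps the total down.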
