
From: Bruno Haible
Subject: Re: [bug-gnulib] Re: coreutils-6.2: various runtime problems on Darwin-8.7.0 HFS+ (including attachment this time)
Date: Wed, 27 Sep 2006 14:39:59 +0200
User-agent: KMail/1.9.1

Jim Meyering wrote:
> I'll use 180.
> The lower we go, the more of a performance penalty
> we impose for directories with very many entries.

I tried the value 180. It worked fine in some cases, but still failed in
others:

$ tar xf /Volumes/ExtData/bin.x86-linux/cross/cross-hppa.tar.gz
$ ll cross/hppa-linux/share/i18n/charmaps | wc -l
195
$ rm -r cross
rm: cannot remove directory `cross/hppa-linux/share/i18n/charmaps': Directory
not empty
$ ll cross/hppa-linux/share/i18n/charmaps | wc -l
17
$ rm -r cross

Actually, the number of files that can be removed before stumbling on the
bug depends on the length of the filenames. Here is a command that
generates a table. First column: l, the length of an additional suffix
tacked onto every filename. Second column: n, the number of files that can
be removed without hitting the bug.

$ for l in `seq 0 200`; do
    suffix=`printf "%${l}s" | tr ' ' 'S'`;
    tar xf /Volumes/ExtData/bin.x86-linux/cross/cross-hppa.tar.gz;
    if test $l != 0; then
      (cd cross/hppa-linux/share/i18n/charmaps;
       for f in *; do mv $f $f$suffix; done
      );
    fi;
    before=`ls -l cross/hppa-linux/share/i18n/charmaps | wc -l`;
    rm -rf cross 2>/dev/null;
    after=`ls -l cross/hppa-linux/share/i18n/charmaps | wc -l`;
    removed=`expr $before - $after`;
    while ! rm -rf cross 2>/dev/null ; do : ; done;
    printf '%3d %3d\n' $l $removed;
  done

 l   n
--- ---
  0 178
  1 174
  2 170
  3 156
  4 152
  5 150
  6 148
  7 138
  8 135
  9 134
 10 131
 11 122
 12 120
 13 119
 14 117
 15 109
 16 108
 17 107
 18 105
 19  99
 20  97
 21  96
 22  95
 23  90
 24  88
 25  87
 26  86
 27  82
 28 115
 29  80
 30  79
 31  76
 32  75
 33  74
 34  73
 35  70
 36  69
 37  69
 38  67
 39  66
 40  65
 41  64
 42  63
 43  61
 44  60
 45  60
 46  59
 47  58
 48  57
 49  57
 50  56
 51  55
 52  54
 53  54
 54  53
 55  52
 56  51
 57  51
 58  50
 59  49
 60  49
 61  48
 62  48
 63  47
 64  46
 65  46
 66  46
 67  45
 68  44
 69  44
 70  43
 71  43
 72  85
 73  42
 74  42
 75  41
 76  41
 77  40
 78  40
 79  39
 80  39
 81  39
 82  38
 83  38
 84  38
 85  37
 86  37
 87  36
 88  36
 89  36
 90  36
 91  35
 92  35
 93  35
 94  35
 95  34
 96  34
 97  34
 98  68
 99  33
100  33
101  33
102  32
103  32
104  32
105  32
106  31
107  31
108  31
109  31
110  30
111  30
112  30
113  30
114  30
115  29
116  29
117  29
118  29
119  28
120  28
121  28
122  28
123  28
124  27
125  27
126  27
127  27
128  27
129  27
130  26
131  26
132  26
133  26
134  26
135  25
136  25
137  25
138  25
139  25
140  25
141  25
142  24
143  24
144  24
145  24
146  24
147  24
148  23
149  23
150  23
151  23
152  23
153  23
154  23
155  22
156  22
157  22
158  22
159  22
160  22
161  22
162  22
163  22
164  21
165  21
166  21
167  21
168  21
169  21
170  21
171  21
172  21
173  20
174  20
175  20
176  20
177  20
178  20
179  20
180  20
181  20
182  20
183  19
184  19
185  19
186  19
187  19
188  19
189  19
190  19
191  19
192  19
193  19
194  19
195  18
196  18
197  18
198  18
199  18
200  18

The initial directory contents are:
    3 files of length 5
    4 files of length 6
    3 files of length 7
    6 files of length 8
   64 files of length 9
   10 files of length 10
    8 files of length 11
   21 files of length 12
   18 files of length 13
   13 files of length 14
    9 files of length 15
   12 files of length 16
    6 files of length 17
    1 file  of length 18
    6 files of length 19
    5 files of length 20
    1 file  of length 21
    2 files of length 22
    1 file  of length 23
    1 file  of length 26
  ---
N=194 files total, sum of lengths S = 2315

It does not require higher mathematics to see that n can essentially be
expressed by the formula

  n = 780000 / (S + l * 194 + 10 * 194)

or

  n = 4020 / (S/N + l + 10)

Recalling that S/N + l is the average length of a filename in the set, one
can almost guess that the readdir bug usually occurs after a single "block"
of directory entries has been filled, where a block is a little less than
4096 bytes and a directory entry consists of the filename, 2 bytes of
alignment, and 8 bytes of data.

Thus, instead of testing whether the number of directory entries read since
the last rewinddir() exceeds a fixed number, a better test is probably
whether

    (total length of file names since last rewinddir()
     + 10 * number of directory entries since last rewinddir())
    > a fixed threshold such as 3000

Bruno



