bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: fts: Document this module


From: Jim Meyering
Subject: Re: fts: Document this module
Date: Thu, 19 Jan 2023 21:24:01 -0800

On Thu, Jan 19, 2023 at 7:05 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
>
> On 1/19/23 15:41, Bruno Haible wrote:
> > Jim or Paul, what should we state
> > — either in the 'fts' module description, or in the .texi documentation?
>
> The quick thing is to say in both that the description/documentation is
> incomplete, and that people need to read the source code.
>
> Jim may be able to fill in a bit here, since I think he wrote most of
> that stuff. (I haven't checked this though; sorry, I'm a bit crunched
> for time today.)

Thanks for caring/documenting. Here's a quick summary (for more
detail, see the comments in fts_.h).

This started when I found glibc's fts was insufficiently robust to
meet GNU rm's needs (rm was merely the first user; now, many others
use it):
- O(N^2) behavior in the number of file name components due to cycle detection
- max hierarchy depth was 64k due to type of fts_level being a "short"
- subject to O(N^2) effects for directories with many entries (poor
locality of reference, for which the fix was to process entries in
sorted-inode order (per a heuristic), delaying any "stat" until
operating on the entry)

Re fts's cycle detection:
- contrast glibc's O(depth) time algorithm vs our O(1) implementation
- our cheap-but-lazy O(1)-memory approach is ok for most applications, but
- there's an optional, slightly more costly detect-ASAP approach required for du
    (uses O(max-depth-of-hierarchy) memory)


Fixing those things required ABI changes and nontrivial redesign.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]