bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 8/8] fts: do not exhaust memory when processing million-entry


From: Jim Meyering
Subject: Re: [PATCH 8/8] fts: do not exhaust memory when processing million-entry directories
Date: Fri, 19 Aug 2011 13:36:27 +0200

Jim Meyering wrote:
> Paul Eggert wrote:
>> Thanks for all that work to make fts better!  A couple of minor things
>> about comments:
>>
>> On 08/18/2011 06:53 AM, Jim Meyering wrote:
>>
>>> +           into memory at once.  However, When an fts_compar function
>>
>> The "However," can be removed (there are too many Buts etc. in the
>> neighborhood already ...).

Removed.  Thanks.

>>> +           The other conditionals ensure
>>> +           that we are using the *at functions (FTS_CWDFD) and that we
>>> +           are not in no-chdir mode (induced by use of FTS_LOGICAL).  */
>>
>> Are these other conditionals independent of whether we want to avoid
>> putting too many entries in RAM?  If so, perhaps we should remove these
>> other conditionals; if not, it'd help for the comment to explain why not.
>
> Initially I didn't even write that "The other conditionals..." sentence.
> I agree that it doesn't say enough.  The main motivation was to solve
> the problems that were most common, and to avoid getting bogged down in
> the less-common use cases.  Already, this change is defending against
> something very unusual: applying common tools to directories with
> millions of entries.  Worrying about applying "du -L" (FTS_LOGICAL)
> on such a directory is far lower priority.
>
> I'll add a FIXME saying that we should try to remove those conditionals,
> but that doing so will require careful testing of those less-common
> use cases.  Actually, as I write this, I confess I'm thinking that the
> marginal gain is not worth the risk.

However ;-)

Everything appears to work fine even without those two conditionals
(retested via coreutils and find, both with the default
FTS_MAX_READDIR_ENTRIES and using a value of 2, and a few
manual uses of du -L vs. large directories),
so I'm merging in the change below.

Thanks for the prod.


diff --git a/lib/fts.c b/lib/fts.c
index 736350e..e3829f3 100644
--- a/lib/fts.c
+++ b/lib/fts.c
@@ -1329,19 +1329,15 @@ fts_build (register FTS *sp, int type)
               }
           }

-        /* Maximum number of readdir entries to read at one time.
-           This limitation is to avoid reading millions of entries
-           into memory at once.  However, When an fts_compar function
-           is specified, we have no choice: we must read all entries
-           into memory before calling that function.  But when no such
-           function is specified, we can read entries in batches that are
-           large enough to help us with inode-sorting, yet not so large
-           that we risk exhausting memory.  The other conditionals ensure
-           that we are using the *at functions (FTS_CWDFD) and that we
-           are not in no-chdir mode (induced by use of FTS_LOGICAL).  */
-        size_t max_entries =
-          (sp->fts_compar == NULL && ISSET (FTS_CWDFD) && ! ISSET (FTS_NOCHDIR)
-           ? FTS_MAX_READDIR_ENTRIES : SIZE_MAX);
+        /* Maximum number of readdir entries to read at one time.  This
+           limitation is to avoid reading millions of entries into memory
+           at once.  When an fts_compar function is specified, we have no
+           choice: we must read all entries into memory before calling that
+           function.  But when no such function is specified, we can read
+           entries in batches that are large enough to help us with inode-
+           sorting, yet not so large that we risk exhausting memory.  */
+        size_t max_entries = (sp->fts_compar == NULL
+                              ? FTS_MAX_READDIR_ENTRIES : SIZE_MAX);

         /*
          * Nlinks is the number of possible entries of type directory in the



reply via email to

[Prev in Thread] Current Thread [Next in Thread]