bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: fts: avoid O(n^2) ext3/ext4 performance hit on million+-entry dirs


From: Jim Meyering
Subject: Re: fts: avoid O(n^2) ext3/ext4 performance hit on million+-entry dirs
Date: Fri, 26 Sep 2008 10:50:17 +0200

Jim Meyering <address@hidden> wrote:
> Bruno Haible <address@hidden> wrote:
>> Jim Meyering wrote:
>>> +   if (nitems > _FTS_INODE_SORT_DIR_ENTRIES_THRESHOLD
>>> +       && !sp->fts_compar
>>> +       && dirent_inode_sort_may_be_useful (sp)) {
>>> +           sp->fts_compar = fts_compare_ino;
>>> +           head = fts_sort (sp, head, nitems);
>>> +           sp->fts_compar = NULL;
>>> +   }
>>
>> This uses fts_sort, which uses qsort, which is O(n^2) in the worst case.
>> (Even the implementation in glibc can be O(n^2), if it guesses that mergesort
>> takes too much memory.)
>>
>> Couldn't this be avoided by using Paul's mpsort module, and changing fts_sort
>> to allocate 50% more room, as scratch space for mpsort?
>
> Good point!
> That would make a fine improvement, as a separate patch.
>
> The new use of qsort in the code I've just proposed for
> rm would benefit from the same treatment.

On second thought, I'm not so sure.
glibc's qsort function does revert to quicksort, but
only upon failure to malloc a buffer for indirect sorting.
Yet each of fts.c and remove.c is already sorting an array
of pointers, so there's no gain in the indirection.

And even if it does resort to using quicksort, the odds of
non-contrived input (the inode numbers) provoking O(N^2)
performance seem vanishingly small.

However, I note that fts_sort would benefit from a rewrite
to make it use mpsort rather than manually setting up its
array of pointers.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]