bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: new module "mpsort" for faster sorting


From: Paul Eggert
Subject: Re: new module "mpsort" for faster sorting
Date: Mon, 29 Jan 2007 14:47:26 -0800
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)

Jim Meyering <address@hidden> writes:

>> I was surprised to see Jim's report that the C locale made no
>> difference to him, compared to the en_US.UTF-8 locale.
>
> I hope I didn't say that.
> The only locale I tried was "C".

Ah, sorry, I misremembered what you said.  I just now measured with my
60895-file directory and got these figures (user+sys time seconds).

  0.31  C locale, qsort-based ls
  0.14  C locale, mpsort-based ls
  0.49  en_US.UTF-8 locale, qsort-based ls
  0.32  en_US.UTF-8 locale, mpsort-based ls

So on my Debian stable x86 platform and test it appears we save 0.17s
due to the sort algorithm speedup, and the number of comparisons
wasn't much changed (contrary to my guess).

Now that I think about it, my guess was based on Solaris behavior, not
Debian stable behavior.  Solaris qsort uses quicksort internally,
whereas Debian normally uses mergesort internally.  E.g., sorting
100,000 random records requires about 1,500,000 comparisons with
Solaris 10 qsort, but only about 850,000 comparisons with mpsort (or
with Debian qsort).

Hence this change to 'ls' should be an even bigger win on Solaris.  To
test this guess, I measured a similar directory on an old Solaris 8
sparc box we have lying around (UFS, local file system, 19478
directory entries, 400 MHz Ultrasparc, 64-bit 'ls' executables,
compiled with GCC 4.1.1) and got these figures:

  0.42  C locale, qsort-based ls
  0.20  C locale, mpsort-based ls
  0.59  en_US locale, qsort-based ls
  0.31  en_US locale, mpsort-based ls

So on Solaris 8 we get roughly a factor-of-two CPU performance
improvement regardless of locale (though there's slightly more
improvement for the C locale), whereas for Debian it's factor-of-two
only for the C locale, and for the en_US.UTF-8 locale it's only a
factor of 1.5 improvement (because Debian already uses mergesort).

Solaris 8 /bin/ls operates at roughly the same speed as the
qsort-based GNU ls, even though Solaris 8 /bin/ls is running in 32-bit
mode whereas I compiled GNU ls in 64-bit mode.  So for large number of
directory entries it does seem like qsort+strcmp is the bottleneck on
Solaris 8.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]