bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: gnulib's Knuth-Morris-Pratt implementation (was: memmem issues)


From: James Youngman
Subject: Re: gnulib's Knuth-Morris-Pratt implementation (was: memmem issues)
Date: Thu, 20 Dec 2007 13:30:31 +0000

On Dec 20, 2007 1:27 PM, Eric Blake <address@hidden> wrote:
> Also, I wonder if Boyer-Moore would be any more effective than KMP in the
> average case (both have worst-case linear complexity, but we know that in
> general, we aren't always dealing with worst-case).
>
> http://en.wikipedia.org/wiki/Knuth-morris-pratt
> http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm

Quite possibly.   If we extracted the KMP implementation as a
prepare-then-search interface, and as I mentioned, I've done quite a
bit of work in that area already, we could do this dynamically (or at
least, without necessarily impacting the caller).

I'm sure Bruno thought about this already, though.   The main problem
though is that the B-M algorithm (I guess we'd likely select
Turbo-Boyer-Moore to reduce the comparisons from 3N to 2N) needs to
move backward through the haystack.  That doesn't work well on
multibyte strings.   We could, as I mentioned, build an ancilliary
data structure, but that would mean adding an extra N operations (thus
making the cost ~3N once again, I suppose).

James.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]