bug-gnu-emacs

bug#43598: replace-in-string: finishing touches


From: Mattias Engdegård
Subject: bug#43598: replace-in-string: finishing touches
Date: Fri, 25 Sep 2020 12:42:06 +0200

On 25 Sep 2020, at 01:54, Lars Ingebrigtsen <larsi@gnus.org> wrote:

> I went ahead and checked in a new C-level function string-search, which
> should be an efficient way to search for strings in strings (using
> memmem, which Emacs has via Gnulib?), and this fixed these corner cases.

Thank you! Here are some proposed tweaks (diff attached):

1. Check the range of the START-POS argument so that we don't crash.
The permitted range is [0..N], where N is (length HAYSTACK), so a start
right after the last character is allowed but nothing further.
We could instead return nil for out-of-range values, but I think signalling
an error is more useful.

2. Make the docs more precise about various things.

3. Slight simplification of the implementation logic to avoid testing the same 
conditions multiple times.

4. More tests, especially for edge cases. Can't have too many!
One test still fails:

 (string-search "ø" "\303\270")

which should return nil but currently matches.
I think it's wrong to convert the needle to unibyte (using Fstring_as_unibyte) 
in this case, but I haven't decided what the best solution would be.
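
To illustrate why the failing test can match at the byte level, here is a small sketch. It assumes that the multibyte character "ø" (U+00F8) is stored in a UTF-8-compatible form, so its bytes are 0xC3 0xB8, which are exactly the two raw bytes written "\303\270" in the unibyte string. The helpers utf8_encode and bytes_coincide are hypothetical names for this illustration, not Emacs functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode code point CP as UTF-8 into OUT, which must hold 4 bytes.
   Returns the number of bytes written, or 0 if CP is above U+10FFFF.
   Surrogates are not rejected; this is only an illustration.  */
static size_t
utf8_encode (unsigned long cp, unsigned char *out)
{
  assert (out != NULL);
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp < 0x110000)
    {
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}

/* Return nonzero if the UTF-8 bytes of CP are identical to the
   RAW_LEN raw bytes at RAW.  A byte-wise search cannot tell the
   two apart when this holds.  */
static int
bytes_coincide (unsigned long cp, const char *raw, size_t raw_len)
{
  unsigned char buf[4];
  size_t n = utf8_encode (cp, buf);
  return n != 0 && n == raw_len && memcmp (buf, raw, n) == 0;
}
```

Under that assumption, bytes_coincide (0xF8, "\303\270", 2) is nonzero, so any search that first reduces the needle to its bytes and then compares bytes will report a match even though the haystack contains two raw bytes and no "ø" character.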

We should also consider these optimisations:
- If SCHARS(needle) > SCHARS(haystack), then no match is possible.
- If either needle or haystack is all-ASCII (all bytes in 0..127), then we can
use memmem without conversion.
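
The START-POS range check from item 1 and the two checks above can be sketched together as follows. This is a unibyte-only illustration in which characters and bytes coincide, not the code in the attached diff. The names search_from, all_ascii, SEARCH_NOT_FOUND and SEARCH_BAD_START are hypothetical, and a plain memcmp loop stands in for memmem to keep the sketch portable. Real code would signal an error instead of returning SEARCH_BAD_START.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not Emacs API.  */
enum { SEARCH_NOT_FOUND = -1, SEARCH_BAD_START = -2 };

/* Return nonzero if all NBYTES bytes at S are in the range 0..127.  */
static int
all_ascii (const char *s, size_t nbytes)
{
  assert (s != NULL);
  for (size_t i = 0; i < nbytes; i++)
    if ((unsigned char) s[i] > 127)
      return 0;
  return 1;
}

/* Search for NEEDLE in HAYSTACK starting at offset START.
   Returns the offset of the first match, SEARCH_NOT_FOUND, or
   SEARCH_BAD_START if START lies outside [0..HAY_LEN].  */
static long
search_from (const char *haystack, size_t hay_len,
             const char *needle, size_t needle_len, long start)
{
  assert (haystack != NULL && needle != NULL);

  /* Permitted range is [0..N]: N itself (just past the last
     character) is allowed, anything beyond is rejected.  */
  if (start < 0 || (size_t) start > hay_len)
    return SEARCH_BAD_START;

  /* A needle longer than the remaining haystack cannot match.  */
  if (needle_len > hay_len - (size_t) start)
    return SEARCH_NOT_FOUND;

  /* Byte-wise scan; stands in for memmem.  */
  for (size_t i = (size_t) start; i + needle_len <= hay_len; i++)
    if (memcmp (haystack + i, needle, needle_len) == 0)
      return (long) i;
  return SEARCH_NOT_FOUND;
}
```

In this sketch, searching "hello" for "lo" from offset 0 returns 3, an empty needle at offset 5 returns 5, and an offset of 6 returns SEARCH_BAD_START. A caller handling mixed unibyte and multibyte strings could use all_ascii on the needle and the haystack to decide whether the byte-wise path is safe without converting either string, as proposed above.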

Attachment: string-search.diff
Description: Binary data

