Re: Upcoming loss of usability of Emacs source files and Emacs.

From: Ulrich Mueller
Subject: Re: Upcoming loss of usability of Emacs source files and Emacs.
Date: Thu, 18 Jun 2015 09:08:06 +0200

>>>>> On Thu, 18 Jun 2015, Eli Zaretskii wrote:

>> ;; Ignore accent and umlaut marks when searching.
>> ;; Works for Emacs 19.30 and later.
>> (let ((eqv-list '("aAàÀáÁâÂãÃäÄåÅ"
>>                    "cCçÇ"
>>                    "eEèÈéÉêÊëË"
>>                    "iIìÌíÍîÎïÏ"
>>                    "nNñÑ"
>>                    "oOòÒóÓôÔõÕöÖøØ"
>>                    "uUùÙúÚûÛüÜ"
>>                    "yYýÝÿ"))
>>        (table (standard-case-table))
>>        canon)
>>   (setq canon (copy-sequence table))
>>   (mapcar (lambda (s)
>>             (mapcar (lambda (c) (aset canon c (aref s 0))) s))
>>           eqv-list)
>>   (set-char-table-extra-slot table 1 canon)
>>   (set-char-table-extra-slot table 2 nil)
>>   (set-standard-case-table table))

> This means you cannot search for, say, å, even if you want to find
> only it and not the other "equivalents", right?

Yes, the idea was to consider them as equivalent, so searching for any
element of the set would match any other.

> That's not how Emacs works now wrt letter-case. At the very least,
> this folding of diacriticals should offer the same flexibility, i.e.
> if the user types 'a', she should be able to find all the variants,
> but if she types 'å', should find only that character.

Good idea.
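
One way to get that asymmetric behaviour could be to expand only the
unaccented letters of the search string into character classes and to
leave accented letters alone. This is a rough sketch, not something
that exists in Emacs: the names my-fold-classes and my-fold-regexp are
made up for illustration, and the table is deliberately short and
lowercase only.

```elisp
;; Hypothetical names, for illustration only; not part of Emacs.
(defvar my-fold-classes
  '((?a . "aàáâãäå")
    (?o . "oòóôõöø")
    (?u . "uùúûü"))
  "Alist mapping a base letter to the characters it should match.")

(defun my-fold-regexp (string)
  "Return a regexp for STRING in which base letters match their variants.
Characters without an entry in `my-fold-classes' match only themselves."
  (mapconcat
   (lambda (c)
     (let ((class (cdr (assq c my-fold-classes))))
       (if class
           ;; Base letter: match any member of its class.
           (concat "[" class "]")
         ;; Anything else, including accented letters: match literally.
         (regexp-quote (char-to-string c)))))
   string ""))
```

With this, (my-fold-regexp "bar") matches both "bar" and "bär", while
(my-fold-regexp "bär") matches only "bär".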

> Also, this doesn't handle decomposed characters, as in 'å'.  So this
> is not really Unicode-compliant, it's a half-measure of sorts.

The above code snippet predates Unicode Emacs, so you cannot expect it
to handle NFC and NFD and other intricacies of Unicode normalisation.
(Also, I've never seen anything other than the NFC forms, e.g., for
German umlauts, in the texts that I usually edit.)
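
To make the NFC/NFD distinction concrete, here is a small sketch using
the ucs-normalize library that ships with Unicode Emacs. The code
points are written out numerically so that the two forms cannot be
confused in transit.

```elisp
(require 'ucs-normalize)

(let* ((nfc (string #xE5))        ; precomposed: U+00E5
       (nfd (string ?a #x30A)))   ; decomposed: a + U+030A COMBINING RING ABOVE
  (list (length nfc)                                    ; 1
        (length nfd)                                    ; 2
        (string= nfc nfd)                               ; nil
        (string= (ucs-normalize-NFD-string nfc) nfd)    ; t
        (string= (ucs-normalize-NFC-string nfd) nfc)))  ; t
```

Both strings display as the same glyph, but they only compare equal
after being brought to a common normalisation form.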

BTW, isearch-forward also doesn't match the precomposed å (U+00E5)
when searching for the decomposed å (a + U+030A), and vice versa. So
by your argument above, search in Emacs isn't Unicode compliant
anyway. (But I'm not sure if it should be, because I think that this
would break Boyer-Moore.)
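
For illustration, a plain search-forward shows the same mismatch, and
normalising both sides first makes it go away. The helper names below
are hypothetical. Note that the second helper works on a normalised
copy of the text, so it sidesteps rather than answers the Boyer-Moore
question.

```elisp
(require 'ucs-normalize)

;; Hypothetical helpers, for illustration only.
(defun my-search-raw (needle text)
  "Return non-nil if NEEDLE occurs literally in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (search-forward needle nil t)))

(defun my-search-normalized (needle text)
  "Return non-nil if NEEDLE occurs in TEXT after both are put in NFC."
  (my-search-raw (ucs-normalize-NFC-string needle)
                 (ucs-normalize-NFC-string text)))

(let ((nfc (string #xE5))        ; precomposed U+00E5
      (nfd (string ?a #x30A)))   ; a + U+030A
  (list (my-search-raw nfc nfd)           ; nil: the forms differ
        (my-search-normalized nfc nfd)))  ; non-nil: both are NFC now
```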


reply via email to

[Prev in Thread] Current Thread [Next in Thread]