bug#51733: 27.1; Detect impossible email addresses better

bug-gnu-emacs

From:	Lars Ingebrigtsen
Subject:	bug#51733: 27.1; Detect impossible email addresses better
Date:	Mon, 17 Jan 2022 21:22:58 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

I'm not quite sure I understand this bit here:
https://www.unicode.org/reports/tr39/#Confusable_Detection

---
For an input string X, define skeleton(X) to be the following transformation on
the string:

    1. Convert X to NFD format, as described in [UAX15].
    2. Concatenate the prototypes for each character in X according to the
       specified data, producing a string of exemplar characters.
    3. Reapply NFD.
---
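
Here is a minimal sketch of how I read those three steps.  The names
`my-prototypes' and `my-skeleton' are made up for this message, and the
one-entry table is only an illustrative stand-in for the real confusables
data (which I have not loaded or checked here), so treat it as a toy:

```elisp
(require 'ucs-normalize)

;; Hypothetical stand-in for the TR39 prototype data: one entry only,
;; mapping CYRILLIC SMALL LETTER A (U+0430) to Latin "a".
(defvar my-prototypes '((?\u0430 . "a"))
  "Toy alist of (CHARACTER . PROTOTYPE-STRING) pairs.")

(defun my-skeleton (string)
  "Return a toy skeleton of STRING: NFD, map prototypes, NFD again."
  (let* ((nfd (ucs-normalize-NFD-string string))
         ;; Replace each character by its prototype, if the table has one.
         (mapped (mapconcat
                  (lambda (char)
                    (or (cdr (assq char my-prototypes))
                        (char-to-string char)))
                  nfd "")))
    (ucs-normalize-NFD-string mapped)))

;; Two strings would count as confusable when their skeletons are equal.
(string= (my-skeleton "p\u0430ypal") (my-skeleton "paypal"))
```

With this toy table, which has no entry for U+01C9, the result for any
string containing that character depends entirely on what NFD does to it.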

I mean, that sounds OK in and of itself, but then:

---
X and Y are single-script confusables if and only if they are confusable, and
their resolved script sets have at least one element in common.

    Examples: “ǉeto” and “ljeto” in Latin (the Croatian word for “summer”),
    where the first word uses only four codepoints, the first of which is
    U+01C9 (ǉ) LATIN SMALL LETTER LJ.
---

But:

(ucs-normalize-NFD-string "ǉeto")
=> "ǉeto"

So according to that algorithm, as I read it, "ǉeto" and "ljeto" are not
confusable.

But if we use NFKD instead, they are:

(ucs-normalize-NFKD-string "ǉeto")
=> "ljeto"
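
To make the comparison easy to repeat, here is a small sketch that returns
both forms side by side.  The helper name `my-nfd-vs-nfkd' is made up for
this message; the two normalization functions are the ones used above:

```elisp
(require 'ucs-normalize)

(defun my-nfd-vs-nfkd (string)
  "Return a list (NFD NFKD) holding both decomposed forms of STRING."
  (list (ucs-normalize-NFD-string string)
        (ucs-normalize-NFKD-string string)))

(my-nfd-vs-nfkd "ǉeto")
;; => ("ǉeto" "ljeto")

;; Only the NFKD form compares equal to the five-character spelling.
(string= (cadr (my-nfd-vs-nfkd "ǉeto")) "ljeto")
;; => t
```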

It seems unlikely to be a typo in this document, surely?  But NFKD seems
to make a whole lot more sense than NFD for this usage.  I must be
missing or misreading something.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
