emacs-devel


From: Eli Zaretskii
Subject: Re: Can watermarking Unicode text using invisible differences sneak through Emacs, or can Emacs detect it?
Date: Fri, 04 Feb 2022 10:03:08 +0200

> From: Richard Stallman <rms@gnu.org>
> Cc: eliz@gnu.org, psainty@orcon.net.nz, luangruo@yahoo.com,
>       emacs-devel@gnu.org, kevin.legouguec@gmail.com
> Date: Thu, 03 Feb 2022 22:52:07 -0500
> 
> It would be useful to be able to analyze and construct complex
> characters -- for instance, to operate on a-with-breve-and-tilde
> and find out that represents an a with two diacritics.

This already exists, see below.  But you seem to have something
different in mind:

> So I propose a function, `diacriticize'.  Its arguments are
> characters, and if they can be graphically combined to make a single
> character, that's what diacriticize returns.  Otherwise, it returns
> nil.
> 
>   (diacriticize ?a ?~ ?˘) => ?ã˘
>   (diacriticize ?a ?Z) => nil
> 
> It could have an inverse function, criticanalyze, which given the
> character code for a character that is (in spirit) a composition,
> would return the characters it consists of:
> 
> (criticanalyze ?ã˘) => (?a ?~ ?˘)
> 
> With these functions, latin1-display could figure out automatically
> which conversions to make.

I don't understand the specification of these functions.  How would
diacriticize know that ?~ (U+007E TILDE) is equivalent to ?̃ (U+0303
COMBINING TILDE), the diacritic that is actually part of ?ã?  We do
have infrastructure in place to decompose characters like ã into the
base character ?a and the combining diacritic(s): the call
(ucs-normalize-NFD-string "ã") returns a string of 2 characters, ?a
and ?̃.  But how do you propose to make the leap from ?̃ to ?~?
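
To make the gap concrete, here is a minimal sketch of what the existing
normalization functions do and do not give us.  It assumes the
ucs-normalize library is available; the `my-' variable names are
hypothetical and exist only for this illustration.

```elisp
(require 'ucs-normalize)

;; Decompose the precomposed character U+00E3 (a with tilde) into its
;; base character and combining mark.
(defvar my-decomposed
  (string-to-list (ucs-normalize-NFD-string "\u00e3")))
;; my-decomposed is (97 771): ?a followed by U+0303 COMBINING TILDE.

;; Recomposition works when the second character is the combining mark:
(defvar my-recomposed
  (ucs-normalize-NFC-string (string ?a #x0303)))
;; my-recomposed is a 1-character string holding U+00E3.

;; It does not work with the ASCII tilde U+007E, which normalization
;; treats as an unrelated character:
(defvar my-not-composed
  (ucs-normalize-NFC-string (string ?a ?~)))
;; my-not-composed is still the 2-character string "a~".
```

So the decomposition side is covered, but nothing in this path relates
U+007E to U+0303; that mapping would have to come from somewhere else.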


