emacs-devel


From: Richard Stallman
Subject: Re: Can watermarking Unicode text using invisible differences sneak through Emacs, or can Emacs detect it?
Date: Mon, 07 Feb 2022 00:11:28 -0500

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > So you mean we should create a database of ASCII characters that
  > approximate the combining diacriticals?  But if so, how is it better
  > than having a database of complete characters and their ASCII
  > equivalents, like we have now in latin1-disp.el?

I think there are only around 20 diacritics.  There must be hundreds
of letters-with-diacritics.  The method I've proposed can handle
everything automatically, given a table about the 20-odd diacritics.
That's a great simplification from a table of hundreds of elements,
set up by hand.
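
Here is a minimal sketch of that method, not a finished implementation.
It assumes the canonical decomposition from ucs-normalize.el
(ucs-normalize-NFD-string) is available.  The names
my-diacritic-ascii-alist and my-ascii-approximation are hypothetical,
and the table shows only a few entries whose ASCII stand-ins are my
own guesses:

```elisp
(require 'ucs-normalize)

;; Hypothetical table: combining diacritic -> ASCII stand-in.
;; Only a handful of entries are shown; the stand-ins are assumptions.
(defvar my-diacritic-ascii-alist
  '((#x0300 . "`")    ; COMBINING GRAVE ACCENT
    (#x0301 . "'")    ; COMBINING ACUTE ACCENT
    (#x0302 . "^")    ; COMBINING CIRCUMFLEX ACCENT
    (#x0303 . "~")    ; COMBINING TILDE
    (#x0308 . "\"")   ; COMBINING DIAERESIS
    (#x0327 . ","))   ; COMBINING CEDILLA
  "Hypothetical map from combining characters to ASCII approximations.")

(defun my-ascii-approximation (string)
  "Return STRING with known diacritics replaced by ASCII stand-ins.
Decompose STRING canonically, then look up each character in
`my-diacritic-ascii-alist'.  Characters not in the table are kept."
  (mapconcat
   (lambda (ch)
     (or (cdr (assq ch my-diacritic-ascii-alist))
         (string ch)))
   (ucs-normalize-NFD-string string)
   ""))
```

With this table, (my-ascii-approximation "ã") gives "a~".  Every
letter that decomposes into a base letter plus one of the listed
diacritics is handled by the same few table entries.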

  >  but a database of complete characters makes it easier to
  > make sure the results are optimal, because you see the original
  > complete character and the complete equivalent,

I don't follow you here.  In particular, what does "complete
equivalent" mean?  Concretely, how would a result be "less than
optimal"?  Can you illustrate with an example?

  > I think reasonable appearance is more important than memory
  > consumption in this case,

What makes an appearance more or less reasonable when we're talking
about replacing one character with two or three that express
_symbolically_ which character it is?  I don't get it.

  > You can use ucs-normalize-NFKD-string for the job of
  > ucs-normalize-NFD-string as well:

  >   (append (ucs-normalize-NFKD-string "ã") nil) => (97 771)

Great!  That does most of the job, I think.

  > (I used 'append' here to make it evident that the result of the
  > decomposition is 2 characters, not one, since the Emacs display will
  > by default combine them into the same glyph as the original non-ASCII
  > character,

Not on a Linux console, I think.  When I have f and i in the buffer,
Emacs does not convert them into a ligature.  The only time it has to
try to deal with a ligature is when there is a Unicode ligature
code point in the buffer.
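
For the case of a ligature code point, a small sketch may help.  It
assumes the ucs-normalize.el functions behave as the Unicode
normalization forms specify, and uses U+FB01 (LATIN SMALL LIGATURE FI)
as the example:

```elisp
(require 'ucs-normalize)

;; U+FB01 is LATIN SMALL LIGATURE FI, a single code point.
(let ((ligature (string #xfb01)))
  (list
   ;; Compatibility decomposition splits it into plain f and i.
   (append (ucs-normalize-NFKD-string ligature) nil)
   ;; Canonical decomposition leaves the ligature code point alone.
   (append (ucs-normalize-NFD-string ligature) nil)))
;; first element: (102 105); second element: (64257)
```

So the compatibility form is the one that would turn such a code point
back into ordinary letters.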

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)




