From: Gregory Heytings
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Mon, 08 Nov 2021 19:58:56 +0000

>> In fact, it did not take me much time to create a case that your
>> algorithm doesn't detect (and AFAIU cannot detect without also
>> displaying warnings about many legitimate uses). I attach the example
>> code, how that code is displayed by Emacs, and how that code would be
>> displayed with the patch I proposed.
>
> Thanks, I've now enhanced the code which detects suspiciously reordered
> source to cover this kind of cases as well. I didn't see any legitimate
> uses flagged after the change, but if you can find any such cases,
> please show them and I will take a look.

Clearly, you failed to understand the meaning of my post. It did *not*
mean:

    Your algorithm could be improved.

It meant:

    Your algorithm cannot be trusted.

It took a non-malevolent actor less than 24 hours (after your commit) to
find a way to escape the detection algorithm that you implemented and that
you claimed was the proper solution to the problem described in the
"Trojan Source" paper. Your slightly improved algorithm will evidently
not hold out any longer if an actually malevolent actor tries to find a
way around it (and of course they won't tell you when and how they did it).

So I'll say it one more time:

The only proper solution to that problem is to highlight, by default,
the reordering control characters in prog-mode and its descendants. That's
the only 100% foolproof solution that guarantees that such constructs will
never be missed, and this is what about 99.99% of Emacs users need. The
remaining 0.01% are those who:

1. Use RTL languages in their source code, AND
2. Use these reordering control characters in their source code, AND
3. Would find such highlighted characters annoying.

Those few users can turn that highlighting option off, either globally or
by turning the minor mode off in this or that buffer.
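
To make the idea concrete, here is a rough Python sketch of what
unconditional highlighting amounts to. It is not the Emacs patch discussed
in this thread: the names BIDI_CONTROLS and reveal_bidi_controls are made
up for this illustration, and the character set is assumed to be the nine
embedding, override and isolate controls covered by the "Trojan Source"
paper. Because no heuristic decides whether a use is "suspicious", there
is nothing to evade.

```python
# Assumed character set: the nine bidirectional formatting characters
# (embeddings, overrides, isolates). Names are the usual Unicode acronyms.
BIDI_CONTROLS = {
    "\u202a": "LRE",
    "\u202b": "RLE",
    "\u202c": "PDF",
    "\u202d": "LRO",
    "\u202e": "RLO",
    "\u2066": "LRI",
    "\u2067": "RLI",
    "\u2068": "FSI",
    "\u2069": "PDI",
}


def reveal_bidi_controls(text: str) -> str:
    """Replace every bidi control character with a visible tag like [RLO].

    Hypothetical helper: every occurrence is shown, with no attempt to
    guess whether the reordering is legitimate.
    """
    return "".join(
        f"[{BIDI_CONTROLS[ch]}]" if ch in BIDI_CONTROLS else ch
        for ch in text
    )
```

Calling reveal_bidi_controls on a line of source returns the same line
with each control character spelled out, so a reordered comment or string
can no longer hide what it contains.
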
>>> The right balance is where the percent of false positives is very low.
>>
>> IMO, that's not the right balance: the right balance is where the
>> percentage of false negatives is zero.
>
> If you need zero false negatives, and don't care about the level of
> noise (i.e. false positives), you have the features for that already:
> customize glyphless-char-display-control to show the control characters
> as acronyms or hex codes.

Again, you clearly fail to understand what I said. The problem has nothing
to do with me; the problem is, as the "Trojan Source" paper rightly
explains, the default settings of the various available editors.
Claiming that asking every Emacs user (except the few users mentioned
above) to set an obscure configuration option (which is only mentioned
once, in passing, in the manual) is a solution to that problem is just
wrong.

Anyway, it's now clear that this problem will remain unfixed in Emacs.
Given this, I can only applaud the Rust developers for their decision to
ban these control characters from Rust code files. If editors cannot be
trusted to do a proper job on this matter, compilers should do it, and I
hope that a similar solution will soon be adopted in other compilers.
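
For illustration only, a check of that kind could look like the Python
sketch below. It is not how the Rust compiler implements it: the names
FORBIDDEN, find_bidi_controls and check_file are hypothetical, and the set
of rejected characters is assumed to be the nine embedding, override and
isolate controls.

```python
import sys

# Assumed set of rejected characters: U+202A..U+202E and U+2066..U+2069.
FORBIDDEN = frozenset("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")


def find_bidi_controls(source: str):
    """Return a (line, column, codepoint) tuple, 1-based, for each hit."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in FORBIDDEN:
                hits.append((line_no, col_no, f"U+{ord(ch):04X}"))
    return hits


def check_file(path: str) -> bool:
    """Report every forbidden character in a file; True means it is clean."""
    with open(path, encoding="utf-8") as handle:
        hits = find_bidi_controls(handle.read())
    for line_no, col_no, code in hits:
        print(f"{path}:{line_no}:{col_no}: bidi control character {code}",
              file=sys.stderr)
    return not hits
```

A build step could call check_file on each source file and fail as soon as
it returns False, which rejects the file outright instead of relying on
how an editor happens to display it.
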
And I leave this discussion with this post.