[bug #63334] \[u....] syntax for ASCII characters handled inconsistently
From: Dave
Subject: [bug #63334] \[u....] syntax for ASCII characters handled inconsistently
Date: Tue, 8 Nov 2022 04:34:05 -0500 (EST)
URL: <https://savannah.gnu.org/bugs/?63334>
Summary: \[u....] syntax for ASCII characters handled inconsistently
Project: GNU troff
Submitter: barx
Submitted: Tue 08 Nov 2022 03:34:02 AM CST
Category: Core
Severity: 2 - Minor
Item Group: Warning/Suspicious behaviour
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Planned Release: None
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: Tue 08 Nov 2022 03:34:02 AM CST By: Dave <barx>
I see the same behavior in groff 1.22.4 and in the latest git code. (And for
that matter, going all the way back to at least 1.19.2.)
ASCII characters represented in \[u....] form are handled inconsistently. A
simple demonstration of the difference:
$ echo '\[u0021]\[u0022]' | nroff | cat -s
troff: <standard input>:1: warning: can't find special character '\!'
"
\[u0022] is correctly converted to, and output as, a quotation mark. But
\[u0021], rather than being converted to a "!", is for some reason converted
to the sequence "\!", which (unsurprisingly) is not a recognized character.
It's not clear to me what internal mechanism might cause this: if "\[u0021]"
were parsed as a backslash followed by "[u0021]", the bracketed sequence
wouldn't be specially interpreted at all.
Looking at all the pre-alphabet ASCII symbols:
$ printf "\\[u%04x] " $(seq 32 64) | nroff | cat -s
Five of them are handled as expected, 15 are converted to unrecognized
backslash sequences (like the "\!" above), and 13 are not recognized at all.
That last case I don't consider a bug, since (current) groff does not specify
that any of them should be recognized. (The 1.22.4 groff_char(7) page sort of
gave the impression that some of them would be, but these sequences have been
removed from the drastically rewritten 1.23 groff_char(7).) Arguably, none of
this is a bug, since no documentation explicitly states that, for example,
"\[u0021]" will be recognized as "!". But the way it _is_ handled is
surprising enough that I wanted to at least bring it to the development team's
attention.
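The survey of code points 32 through 64 may be easier to read if each escape sits on its own input line, so that every troff warning names the line (and therefore the code point) that triggered it. The following is only a sketch under stated assumptions: the function name gen_escapes is hypothetical, and it prints uppercase hexadecimal digits (%04X) because groff's documentation describes the digits of \[uXXXX] as uppercase, whereas the printf command in the report uses %04x. Results may therefore differ from the counts quoted above.

```shell
# Hypothetical helper (not part of groff): print one \[uXXXX] escape
# per line for every code point from $1 through $2, inclusive.
gen_escapes() {
    first=$1
    last=$2
    i=$first
    while [ "$i" -le "$last" ]; do
        # '\\' in the format string yields a single backslash;
        # %04X zero-pads to four uppercase hexadecimal digits.
        printf '\\[u%04X]\n' "$i"
        i=$((i + 1))
    done
}
```

For example, `gen_escapes 32 64 | nroff 2>&1 | cat -s` would feed the same range to nroff, with input line N corresponding to code point 31 + N.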
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?63334>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/