[PATCH 0/9] Fix reports
From: Akim Demaille
Subject: [PATCH 0/9] Fix reports
Date: Sat, 13 Jun 2020 17:23:08 +0200
I was working on integrating Vincent's counterexamples into the
reports when I noticed that our reports were truly ugly when UTF-8 was
used in string aliases.
I have always had problems with the way Bison escapes the user's
strings (which resulted in people not being able to trust the aliases,
something which error=detailed fixed in 3.6), but that was really yet
another sign that we should not toy with the user's spelling of her
strings.
So this batch revises the way Bison parses strings. Instead of
interpreting them (i.e., resolving the escapes), it keeps them the way
they are, but it does check for their validity (so it will reject \777
for instance).
Only when the string must be interpreted (e.g., for %output) do we
unquote the strings.
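To make the "check but do not interpret" idea concrete, here is a minimal
sketch in C. It is not the code from this series: the names scan_escape,
spelling_is_valid and unquote are hypothetical, only a subset of escapes
is handled (simple ones, octal, and \x), and rejecting \0 is a choice of
this sketch so that the decoded result stays a plain C string.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Value of the hexadecimal digit C.  */
static unsigned
hex_value (char c)
{
  unsigned char u = (unsigned char) c;
  return isdigit (u) ? (unsigned) (u - '0') : (unsigned) (tolower (u) - 'a' + 10);
}

/* Hypothetical helper: S points at a backslash.  Return the length of
   the escape, or 0 if it is invalid.  If VALUE is non-null, store the
   decoded byte there.  */
static size_t
scan_escape (const char *s, unsigned char *value)
{
  static const char names[] = "abfnrtv\\\"'?";
  static const char bytes[] = "\a\b\f\n\r\t\v\\\"'?";
  unsigned v = 0;
  size_t len = 0;
  const char *p;
  assert (s && s[0] == '\\');
  if (s[1] != '\0' && (p = strchr (names, s[1])) != NULL)
    {
      v = (unsigned char) bytes[p - names];
      len = 2;
    }
  else if ('0' <= s[1] && s[1] <= '7')
    {
      /* Up to three octal digits.  */
      for (len = 1; len <= 3 && '0' <= s[len] && s[len] <= '7'; ++len)
        v = v * 8 + (unsigned) (s[len] - '0');
    }
  else if (s[1] == 'x' && isxdigit ((unsigned char) s[2]))
    {
      /* Any number of hexadecimal digits; stop early on overflow.  */
      for (len = 2; v <= UCHAR_MAX && isxdigit ((unsigned char) s[len]); ++len)
        v = v * 16 + hex_value (s[len]);
    }
  /* Sketch choice: reject \0 and anything that does not fit in a byte.  */
  if (len == 0 || v == 0 || v > UCHAR_MAX)
    return 0;
  if (value)
    *value = (unsigned char) v;
  return len;
}

/* Check the escapes of SPELLING without changing it.  */
static bool
spelling_is_valid (const char *spelling)
{
  for (const char *p = spelling; *p; )
    if (*p != '\\')
      ++p;
    else
      {
        size_t len = scan_escape (p, NULL);
        if (len == 0)
          return false;
        p += len;
      }
  return true;
}

/* Decode a valid SPELLING into OUT (at least strlen (SPELLING) + 1
   bytes).  Only callers that need the interpreted value do this.  */
static void
unquote (const char *spelling, char *out)
{
  assert (spelling_is_valid (spelling));
  for (const char *p = spelling; *p; )
    if (*p != '\\')
      *out++ = *p++;
    else
      {
        unsigned char byte;
        p += scan_escape (p, &byte);
        *out++ = (char) byte;
      }
  *out = '\0';
}
```

With this split, spelling_is_valid ("\\777") is false, while "\\x41" and
"\\101" are both accepted and kept as two distinct spellings; only a call
to unquote turns either of them into "A".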
The news entry reads:
*** String aliases are faithfully propagated
Bison used to interpret user strings (i.e., decoding backslash escapes)
when reading them, and to escape them (i.e., issue non-printable
characters as backslash escapes, taking the locale into account) when
outputting them. As a consequence, non-ASCII strings (say in UTF-8) ended
up "ciphered" as sequences of backslash escapes. This happened not only
in the generated sources (where the compiler will reinterpret them), but
also in all the generated reports (text, xml, html, dot, etc.). Reports
were therefore not readable when string aliases were not pure ASCII.
Worse yet: the output depended on the user's locale.
Now Bison faithfully treats the string aliases exactly the way the user
spelled them. This fixes all the aforementioned problems. However, now,
string aliases semantically equivalent but syntactically different (e.g.,
"A", "\x41", "\101") are considered to be different.
Cheers!
Akim Demaille (9):
style: prefer 'FOO ()' to 'FOO' for function-like macros
style: reduce scopes
style: introduce & use STRING_1GROW
style: factor common bits about string scanning
tests: check reports with conflicts and UTF-8
parser: keep string aliases as the user wrote it
regen
reports: don't escape the labels
reports: the column width differs from the byte count
NEWS | 17 ++
src/flex-scanner.h | 17 +-
src/graphviz.c | 22 +-
src/graphviz.h | 6 -
src/main.c | 8 +-
src/muscle-tab.c | 3 +-
src/parse-gram.c | 212 ++++++++++++++----
src/parse-gram.h | 9 +-
src/parse-gram.y | 193 ++++++++++++++---
src/print-graph.c | 19 +-
src/print.c | 13 +-
src/reader.c | 1 +
src/scan-code.l | 26 +--
src/scan-gram.l | 122 +++++------
src/scan-skel.l | 8 +-
src/system.h | 17 ++
tests/input.at | 4 +-
tests/regression.at | 4 +-
tests/report.at | 515 ++++++++++++++++++++++++++++++++++++++++++++
19 files changed, 1023 insertions(+), 193 deletions(-)
--
2.27.0