
[PATCH 0/9] Fix reports


From: Akim Demaille
Subject: [PATCH 0/9] Fix reports
Date: Sat, 13 Jun 2020 17:23:08 +0200

I was working on integrating Vincent's counterexamples into the
reports when I noticed that our reports were truly ugly when UTF-8 was
used in string aliases.

I have always had problems with the way bison escapes the user's
strings (which resulted in people not being able to trust the aliases,
something which parse.error=detailed fixed in 3.6), but that was really
yet another sign that

  we should not toy with the user's spelling of her strings.

So this batch revises the way Bison parses strings.  Instead of
interpreting them (i.e., resolving the escapes), it keeps them the way
they are, but it does check for their validity (so it will reject \777
for instance).

Only when the string must be interpreted (e.g., for %output) do we
unquote the strings.
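
To illustrate the "validate but do not interpret" idea, here is a
minimal sketch in C.  It is not the code from this series: the helper
name escapes_valid and the exact set of accepted escapes are
assumptions made for the example.  It walks over the text between the
quotes, checks each backslash escape, and never modifies the string.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not Bison's actual code: return true if every
   backslash escape in S (the text between the quotes) is well formed.
   S is only read, so the user's spelling is preserved.  */
static bool
escapes_valid (const char *s)
{
  while (*s)
    {
      if (*s++ != '\\')
        continue;               /* Ordinary byte, e.g. part of UTF-8.  */
      if (*s && strchr ("abfnrtv\\\"'?", *s))
        s++;                    /* Simple one-character escape.  */
      else if ('0' <= *s && *s <= '7')
        {
          /* Octal escape: at most three digits, must fit in a byte.  */
          unsigned val = 0;
          for (int i = 0; i < 3 && '0' <= *s && *s <= '7'; i++)
            val = val * 8 + (unsigned) (*s++ - '0');
          if (255 < val)
            return false;
        }
      else if (*s == 'x' && isxdigit ((unsigned char) s[1]))
        {
          /* Hexadecimal escape: all following hex digits, same range.  */
          unsigned val = 0;
          for (s++; isxdigit ((unsigned char) *s); s++)
            {
              int c = tolower ((unsigned char) *s);
              val = val * 16 + (unsigned) (isdigit (c) ? c - '0' : c - 'a' + 10);
              if (255 < val)
                return false;
            }
        }
      else
        return false;           /* Unknown or truncated escape.  */
    }
  return true;
}
```

With this sketch, the spellings \101 and \x41 are accepted and kept as
written, while \777 is rejected because 0777 does not fit in a byte.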

The news entry reads:

*** String aliases are faithfully propagated

  Bison used to interpret user strings (i.e., decoding backslash escapes)
  when reading them, and to escape them (i.e., issue non-printable
  characters as backslash escapes, taking the locale into account) when
  outputting them.  As a consequence non-ASCII strings (say in UTF-8) ended
  up "ciphered" as sequences of backslash escapes.  This happened not only
  in the generated sources (where the compiler will reinterpret them), but
  also in all the generated reports (text, xml, html, dot, etc.).  Reports
  were therefore not readable when string aliases were not pure ASCII.
  Worse yet: the output depended on the user's locale.

  Now Bison faithfully treats the string aliases exactly the way the user
  spelled them.  This fixes all the aforementioned problems.  However, now,
  string aliases semantically equivalent but syntactically different (e.g.,
  "A", "\x41", "\101") are considered to be different.
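
A small C sketch of that last consequence.  The decoder below is
hypothetical and written only for this illustration (it handles just
octal and hexadecimal escapes and assumes its input was already
validated); it is not Bison's unquoting code.  Comparing the spellings
as written tells the aliases apart, while comparing the decoded bytes
would not.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical decoder, for illustration only.  Resolves octal and
   hexadecimal escapes in S and copies everything else verbatim.
   OUT must have room for strlen (S) + 1 bytes.  */
static void
decode_numeric_escapes (const char *s, char *out)
{
  while (*s)
    {
      if (s[0] == '\\' && s[1] == 'x')
        {
          /* Assumes at least one hex digit follows (validated input).  */
          char *end;
          *out++ = (char) strtoul (s + 2, &end, 16);
          s = end;
        }
      else if (s[0] == '\\' && '0' <= s[1] && s[1] <= '7')
        {
          unsigned val = 0;
          s++;
          for (int i = 0; i < 3 && '0' <= *s && *s <= '7'; i++)
            val = val * 8 + (unsigned) (*s++ - '0');
          *out++ = (char) val;
        }
      else if (s[0] == '\\' && s[1])
        {
          /* Any other escape is left as spelled in this sketch.  */
          *out++ = *s++;
          *out++ = *s++;
        }
      else
        *out++ = *s++;
    }
  *out = '\0';
}

/* Identity by spelling: two aliases match only if written the same.  */
static int
same_alias (const char *a, const char *b)
{
  return strcmp (a, b) == 0;
}
```

Here same_alias ("A", "\\x41") is false even though both spellings
decode to the single byte 'A'.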


Cheers!

Akim Demaille (9):
  style: prefer 'FOO ()' to 'FOO' for function-like macros
  style: reduce scopes
  style: introduce & use STRING_1GROW
  style: factor common bits about string scanning
  tests: check reports with conflicts and UTF-8
  parser: keep string aliases as the user wrote it
  regen
  reports: don't escape the labels
  reports: the column width differs from the byte count

 NEWS                |  17 ++
 src/flex-scanner.h  |  17 +-
 src/graphviz.c      |  22 +-
 src/graphviz.h      |   6 -
 src/main.c          |   8 +-
 src/muscle-tab.c    |   3 +-
 src/parse-gram.c    | 212 ++++++++++++++----
 src/parse-gram.h    |   9 +-
 src/parse-gram.y    | 193 ++++++++++++++---
 src/print-graph.c   |  19 +-
 src/print.c         |  13 +-
 src/reader.c        |   1 +
 src/scan-code.l     |  26 +--
 src/scan-gram.l     | 122 +++++------
 src/scan-skel.l     |   8 +-
 src/system.h        |  17 ++
 tests/input.at      |   4 +-
 tests/regression.at |   4 +-
 tests/report.at     | 515 ++++++++++++++++++++++++++++++++++++++++++++
 19 files changed, 1023 insertions(+), 193 deletions(-)

-- 
2.27.0



