portability issues with unicodeio (was: [GNU Bison 3.6.90] testsuite: 17


From: Akim Demaille
Subject: portability issues with unicodeio (was: [GNU Bison 3.6.90] testsuite: 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 196 220 221 228 244 245 246 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 549 555 562 567 577 failed)
Date: Wed, 8 Jul 2020 06:45:35 +0200

Hi!

Bison uses gnulib's unicodeio module to emit bullets (•) portably,
with a fallback to '.'.  It's implemented this way (src/gram.h):

> /* Fallback in case we can't print "•".  */
> static inline long
> print_dot_fallback (unsigned int code _GL_UNUSED,
>                     const char *msg _GL_UNUSED,
>                     void *callback_arg)
> {
>   FILE *out = (FILE *) callback_arg;
>   putc ('.', out);
>   return -1;
> }
> 
> /* Print "•", the symbol used to represent a point in an item (aka, a
>    dotted rule).  */
> static inline void
> print_dot (FILE *out)
> {
>   unicode_to_mb (0x2022, fwrite_success_callback, print_dot_fallback, out);
> }

Unfortunately, in Kiyoshi's environment (SunOS hidden 5.11 11.3 i86pc i386 i86pc,
GCC 9.3.0) we get '?' instead of '.' in the C locale.  It is a genuine ASCII
'?' (byte 0x3f), not a substitution made by a terminal that fails to display
the character.  With en_US.UTF-8 we properly get the bullet.
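
For comparison, here is a minimal sketch of a fallback that does not depend on
the conversion reporting a failure: it asks the locale for its codeset and only
emits the bullet when that codeset is UTF-8.  The names codeset_is_utf8 and
print_dot_checked are hypothetical; this is not what Bison or gnulib do, only an
illustration of the idea.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Bison or gnulib): return 1 if the
   current LC_CTYPE codeset is reported as UTF-8, 0 otherwise.  */
static int
codeset_is_utf8 (void)
{
  const char *codeset = nl_langinfo (CODESET);
  return strcmp (codeset, "UTF-8") == 0 || strcmp (codeset, "utf8") == 0;
}

/* Hypothetical variant of print_dot: emit the UTF-8 encoding of U+2022
   when the locale's codeset is UTF-8, and an ASCII '.' otherwise.  */
static void
print_dot_checked (FILE *out)
{
  if (codeset_is_utf8 ())
    fputs ("\xe2\x80\xa2", out);
  else
    putc ('.', out);
}
```

The caller is expected to have called setlocale (LC_ALL, "") beforehand, as
usual.  The obvious cost is that a non-UTF-8 locale whose charset can represent
the bullet would get '.' as well, which unicode_to_mb is meant to avoid.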

Kiyoshi can reproduce the problem with GNU Coreutils' printf, where he
gets a '?', although the fallback there displays the escape sequence (i.e.,
it should print '\u2022' literally):

> /* Simple failure callback that displays a fallback representation in plain
>    ASCII, using the same notation as ISO C99 strings.  */
> static long
> fallback_failure_callback (unsigned int code,
>                            const char *msg _GL_UNUSED,
>                            void *callback_arg)
> {
>   FILE *stream = (FILE *) callback_arg;
> 
>   if (code < 0x10000)
>     fprintf (stream, "\\u%04X", code);
>   else
>     fprintf (stream, "\\U%08X", code);
>   return -1;
> }
> 
> /* Outputs the Unicode character CODE to the output stream STREAM.
>    Upon failure, exit if exit_on_error is true, otherwise output a fallback
>    notation.  */
> void
> print_unicode_char (FILE *stream, unsigned int code, int exit_on_error)
> {
>   unicode_to_mb (code, fwrite_success_callback,
>                  exit_on_error
>                  ? exit_failure_callback
>                  : fallback_failure_callback,
>                  stream);
> }



Kiyoshi's messages start here:

https://lists.gnu.org/r/bug-bison/2020-07/msg00001.html

The latest:

> On 6 Jul 2020, at 22:35, Kiyoshi KANAZAWA <yoi_no_myoujou@yahoo.co.jp>
> wrote:
> 
> Hi Akim,
> 
> $ LC_ALL=C $coreutilsbin/printf '\u2022\n' | od -t x1
> 0000000 3f 0a
> 0000002
> 
> $ LC_ALL=en_US.UTF-8 $coreutilsbin/printf '\u2022\n' | od -t x1
> 0000000 e2 80 a2 0a
> 0000004
> 
> 
> FYI, I have a very limited set of locales.
> $ locale -a
> C
> POSIX
> en_US.ISO8859-1
> en_US.ISO8859-15
> en_US.ISO8859-15@euro
> en_US.UTF-8
> ja_JP.PCK
> ja_JP.UTF-8
> ja_JP.UTF-8@cldr
> ja_JP.eucJP
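
One way to narrow this down, assuming the conversion ultimately goes through the
system's iconv, is to call iconv directly and look at its return value.  POSIX
specifies that iconv returns (size_t) -1 on failure and otherwise the number of
non-reversible conversions performed, so a positive result together with a '?'
in the output would suggest that the substitution happens inside iconv, before
any failure callback can run.  The function probe_iconv below is a hypothetical
helper written for this message, not part of gnulib.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical probe (not from gnulib or Bison): convert the short UTF-8
   string SRC to the charset TOCODE, storing at most OUTSIZE bytes in OUT
   and the number of bytes produced in *OUTLEN.  Return (size_t) -1 if the
   conversion cannot be set up, fails, or leaves input unconsumed;
   otherwise return iconv's own result, i.e., the number of non-reversible
   conversions it performed.  */
static size_t
probe_iconv (const char *tocode, const char *src,
             char *out, size_t outsize, size_t *outlen)
{
  char inbuf[16];
  size_t inleft = strlen (src);

  *outlen = 0;
  if (inleft > sizeof inbuf)
    return (size_t) -1;
  /* Work on a mutable copy, since iconv takes a char ** input.  */
  memcpy (inbuf, src, inleft);

  iconv_t cd = iconv_open (tocode, "UTF-8");
  if (cd == (iconv_t) -1)
    return (size_t) -1;

  char *inptr = inbuf;
  char *outptr = out;
  size_t outleft = outsize;
  size_t res = iconv (cd, &inptr, &inleft, &outptr, &outleft);
  iconv_close (cd);

  if (res == (size_t) -1 || inleft > 0)
    return (size_t) -1;
  *outlen = outsize - outleft;
  return res;
}
```

Calling probe_iconv with "\xe2\x80\xa2" as SRC and, as TOCODE, whatever
nl_langinfo (CODESET) reports in the C locale would show which of the two
behaviors the system has.  Some systems declare iconv's input argument as
const char **, in which case a cast is needed.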

I'm unsure what the next steps would be from here.

Thanks in advance!

