bug-texinfo

Re: c32width gives incorrect return values in C locale


From: Bruno Haible
Subject: Re: c32width gives incorrect return values in C locale
Date: Sat, 11 Nov 2023 21:06:41 +0100

[CCing bug-gnulib]
Gavin Smith wrote:
> > I guess you will need to look at the Unicode characters that you pass to
> > c32width, and whether you get return values < 1 for some of them.
> 
> It is locale-dependent!
> 
> It looks like c32width is simply being redirected to wcwidth which then
> doesn't work properly with LC_ALL=C.  This is from the gnulib module
> c32width.
> 
> I don't know if there is an easy way to make a self-contained example
> to show the difference, because it needs all the gnulib Makefile machinery,
> but the difference shows up for any non-ASCII character.  If I add a line
> like
> 
>  fprintf (stderr, "width of [%4.0lx] is %d (remaining %s)\n",
>                     (long) wc, width, q);
> 
> in the right place in the code, where width is the result of c32width,
> then the output looks like
> 
> width of [  40] is 1 (remaining @)
> width of [  4f] is 1 (remaining OE )
> width of [  45] is 1 (remaining E )
> width of [ 152] is -1 (remaining Œ)
> width of [  28] is 1 (remaining (Œ)
> 
> for LC_ALL=C, but
> 
> width of [  40] is 1 (remaining @)
> width of [  4f] is 1 (remaining OE )
> width of [  45] is 1 (remaining E )
> width of [ 152] is 1 (remaining Œ)
> width of [  28] is 1 (remaining (Œ)
> 
> otherwise (LC_ALL=en_GB.UTF-8).

Indeed, the c32* functions by design work only on those Unicode characters
that can be represented as multibyte sequences in the current locale.
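
For illustration, the locale dependence can be seen with plain wcwidth,
which, per the report above, is what c32width ends up calling here. This
is only a sketch: width_in_locale is a made-up helper name, and the result
for U+0152 in the "C" locale depends on the libc (the report shows -1,
other systems may differ).

```c
#define _XOPEN_SOURCE 700   /* expose wcwidth() in <wchar.h> on POSIX systems */
#include <locale.h>
#include <wchar.h>

/* Hypothetical helper: return the width that wcwidth() reports for WC
   after switching LC_ALL to LOCALE_NAME.
   Returns -2 if the requested locale cannot be set (wcwidth itself
   uses -1 for characters it considers non-printable).
   Note: this changes the process-wide locale and does not restore it.  */
int
width_in_locale (const char *locale_name, wchar_t wc)
{
  if (setlocale (LC_ALL, locale_name) == NULL)
    return -2;
  return wcwidth (wc);
}
```

Calling width_in_locale ("C", 0x152) and then
width_in_locale ("en_GB.UTF-8", 0x152) should reproduce the difference
shown in the two listings above, provided the second locale is installed
and wchar_t holds Unicode code points on the platform.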

I'll document this better in the Gnulib manual.

Since you want texinfo to work on UTF-8 encoded text with characters outside
the repertoire of the current locale, you'll need the libunistring functions,
documented in
<https://www.gnu.org/software/libunistring/manual/html_node/uniwidth_002eh.html>.
Namely, replace c32width with uc_width.

Bruno





