Re: iswprint() and wcwidth() don't work properly on some platforms with

bug-gnulib


From:	Eric Blake
Subject:	Re: iswprint() and wcwidth() don't work properly on some platforms with certain unicodes
Date:	Tue, 4 Sep 2018 11:58:52 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 08/31/2018 12:42 PM, Bruno Haible wrote:

Yes. This particular character (U+1F600) was added in Unicode 6.1 [1][2].

The iswprint() function is implemented in the libc, which is why you see
differences across platforms. After a new Unicode release is made, it
takes some time until the libc picks it up.

Then, it takes some time until the distros pick up the new glibc release.

Fedora 28 uses glibc 2.27, released in 2018.
CentOS 7, like RHEL 7, uses glibc 2.17, released in 2012.

gnulib adds basic Unicode support when that is missing from the platforms
(e.g. wcwidth(0x3000)), but we don't make an effort to support the most
recent Unicode standards, because that would be a lot of work for something
that the platforms themselves will be doing.


Indeed, whereas Unicode 11.0.0 is now released,...

Also when these unicodes are thrown at
wcwidth(), it returns incorrect width for these unicodes, but it might
be because of the fact that these unicodes are considered unprintable


Yes, wcwidth relies on iswprint.
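
A quick way to see what a given platform's libc actually reports is a small
probe along the following lines. This is only a sketch: the names
probe_codepoint() and probe_examples() are made up for illustration, and it
assumes that wchar_t holds Unicode code points (as it does on glibc) and that
the environment selects a UTF-8 locale.

```c
#define _XOPEN_SOURCE 700   /* request the wcwidth() declaration from <wchar.h> */
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Harmless redeclaration; keeps the sketch compiling if <wchar.h> was
   already included before the feature-test macro above was defined. */
int wcwidth(wchar_t wc);

/* Report what the platform's libc says about one code point.
   Assumption: wchar_t holds Unicode code points (true on glibc).
   Returns the wcwidth() result; -1 means "not printable". */
int probe_codepoint(unsigned long cp)
{
    wchar_t wc;
    int printable;
    int width;

    assert(cp <= 0x10FFFFUL);            /* outside the Unicode range is a caller bug */
    wc = (wchar_t) cp;
    printable = iswprint((wint_t) wc) != 0;
    width = wcwidth(wc);

    printf("U+%04lX: iswprint=%d wcwidth=%d\n", cp, printable, width);
    return width;
}

/* Probe an old character and the emoji discussed in this thread, using
   the locale taken from the environment (expected to be UTF-8). */
void probe_examples(void)
{
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "warning: could not set locale from environment\n");
    probe_codepoint(0x00E9UL);   /* LATIN SMALL LETTER E WITH ACUTE */
    probe_codepoint(0x1F600UL);  /* GRINNING FACE, the character in question */
}
```

Calling probe_examples() from main() on two machines with different libc
versions should make any difference for U+1F600 visible, while the older
character is expected to behave the same on both.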

To get the behaviour you want, you may try to force the wcwidth replacement
which is based on (currently) Unicode 9.0.0. To do so, set the environment

...gnulib being at 9.0.0 can actually result in regressions if gnulib
replaces a libc function merely for being at a different version of Unicode.

variable
   gl_cv_func_wcwidth_works=no
at configure time.

Yes, that works for a one-time per-machine override, for testing if
using gnulib-provided replacements (that force a particular Unicode
version, which may be newer or older than the libc's version) behave
sanely across multiple platforms. But it is not a wise idea to codify
that into libvirt's configure.ac (or any other project).

Rather, if libvirt is hitting test failures due solely to the difference
of Unicode version that the underlying libc complies with, it might be
better to rewrite the failing tests to instead use different Unicode
characters that were available since the oldest supported version of
Unicode across any platform being targeted by libvirt, instead of
testing the behavior of problematic characters that were only recently
added in newer Unicode.
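
As a rough illustration of that suggestion, a test could restrict itself to
characters that have been in Unicode for a long time. The sketch below is not
taken from libvirt; the function name check_stable_characters(), the sample
characters, and the locale names "C.UTF-8" and "en_US.UTF-8" are all
assumptions made for the example, and it assumes wchar_t holds Unicode code
points (as on glibc).

```c
#define _XOPEN_SOURCE 700   /* request the wcwidth() declaration from <wchar.h> */
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Harmless redeclaration; keeps the sketch compiling if <wchar.h> was
   already included before the feature-test macro above was defined. */
int wcwidth(wchar_t wc);

/* Illustrative test data: characters present since early Unicode
   versions, with the column width a UTF-8 locale is expected to report. */
struct width_case { unsigned long cp; int width; const char *name; };

static const struct width_case stable_cases[] = {
    { 0x0041UL, 1, "LATIN CAPITAL LETTER A" },
    { 0x00E9UL, 1, "LATIN SMALL LETTER E WITH ACUTE" },
    { 0x3042UL, 2, "HIRAGANA LETTER A" },
};

/* Returns the number of mismatches, or -1 if no UTF-8 locale could be
   selected (in which case the check is skipped rather than failed). */
int check_stable_characters(void)
{
    /* Assumed locale names; adjust to what the target platforms provide. */
    static const char *const candidates[] = { "C.UTF-8", "en_US.UTF-8" };
    const char *locale = NULL;
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof candidates / sizeof candidates[0] && locale == NULL; i++)
        locale = setlocale(LC_ALL, candidates[i]);
    if (locale == NULL)
        return -1;

    for (i = 0; i < sizeof stable_cases / sizeof stable_cases[0]; i++) {
        const struct width_case *c = &stable_cases[i];
        wchar_t wc;

        assert(c->cp <= 0x10FFFFUL);     /* table entries must be valid code points */
        wc = (wchar_t) c->cp;
        if (!iswprint((wint_t) wc) || wcwidth(wc) != c->width) {
            fprintf(stderr, "mismatch for U+%04lX %s\n", c->cp, c->name);
            failures++;
        }
    }
    return failures;
}
```

Skipping when no UTF-8 locale is installed keeps such a test from failing for
reasons unrelated to the Unicode version of the libc.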


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
