bug-libunistring

Re: [bug-libunistring] c32width gives incorrect return values in C locale


From: Bruno Haible
Subject: Re: [bug-libunistring] c32width gives incorrect return values in C locale
Date: Sat, 11 Nov 2023 23:54:52 +0100

[CCing bug-libunistring]
Gavin Smith wrote:
> I did not understand why uc_width was said to be "locale dependent":
> 
>   "These functions are locale dependent."
> 
> - from 
> <https://www.gnu.org/software/libunistring/manual/html_node/uniwidth_002eh.html#index-uc_005fwidth>.

That's because some Unicode characters have "ambiguous width": width 1 in
Western locales, width 2 in East Asian locales (for historical and font-choice
reasons).

> I also don't understand the purpose of the "encoding" argument -- can this
> always be "UTF-8"?

Yes, it can always be "UTF-8"; in that case uc_width will always choose
width 1 for these characters.
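
To illustrate the idea, here is a minimal, self-contained sketch of how an
encoding argument can decide the width of an ambiguous-width character. It
is not libunistring's implementation: sketch_width, is_sample_ambiguous and
is_sample_cjk_encoding are hypothetical names, and both the character list
and the encoding list are tiny samples chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sample of characters whose East Asian Width property is
   "Ambiguous"; the real Unicode data lists many more. */
static int is_sample_ambiguous(uint32_t uc)
{
    return uc == 0x00A1    /* INVERTED EXCLAMATION MARK */
        || uc == 0x00B1    /* PLUS-MINUS SIGN */
        || uc == 0x03B1    /* GREEK SMALL LETTER ALPHA */
        || uc == 0x2460;   /* CIRCLED DIGIT ONE */
}

/* Hypothetical sample of legacy East Asian encoding names. */
static int is_sample_cjk_encoding(const char *encoding)
{
    return strcmp(encoding, "EUC-JP") == 0
        || strcmp(encoding, "EUC-KR") == 0
        || strcmp(encoding, "GBK") == 0
        || strcmp(encoding, "BIG5") == 0;
}

/* Width 2 only for an ambiguous character in an East Asian encoding;
   everything else is treated as width 1 in this sketch. */
static int sketch_width(uint32_t uc, const char *encoding)
{
    assert(encoding != NULL);
    if (is_sample_ambiguous(uc) && is_sample_cjk_encoding(encoding))
        return 2;
    return 1;
}
```

With this sketch, sketch_width(0x00A1, "UTF-8") yields 1 and
sketch_width(0x00A1, "EUC-JP") yields 2, which mirrors the behaviour
described above for a fixed "UTF-8" argument.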

> I'm also unclear on the exact relationship between the types char32_t,
> ucs4_t and uint32_t.  For example, uc_width takes a ucs4_t argument
> but u8_mbtouc writes to a char32_t variable.  In the code I committed,
> I used a cast to ucs4_t when calling uc_width.

These types are all identical. Therefore you don't even need to cast.

  - char32_t comes from <uchar.h> (ISO C 11 or newer).
  - ucs4_t comes from GNU libunistring.
  - uint32_t comes from <stdint.h>.
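
A small sketch of what "no cast needed" looks like, using only standard
headers. demo_ucs4_t is a hypothetical stand-in for ucs4_t, on the
assumption that libunistring defines ucs4_t as a typedef of uint32_t; real
code would include <unitypes.h> instead. The static_assert checks at compile
time that char32_t and uint32_t are the same type on the build platform.

```c
#include <assert.h>
#include <stdint.h>
#include <uchar.h>

/* Hypothetical stand-in for libunistring's ucs4_t (assumed here to be a
   typedef of uint32_t). */
typedef uint32_t demo_ucs4_t;

/* Compilation fails here if char32_t is not the same type as uint32_t. */
static_assert(_Generic((char32_t)0, uint32_t: 1, default: 0),
              "char32_t and uint32_t are distinct types on this platform");

/* Takes the library-style type and returns the code point unchanged. */
static uint32_t code_point_value(demo_ucs4_t uc)
{
    return uc;
}

/* Passes a char32_t where a demo_ucs4_t is expected, without a cast. */
static uint32_t pass_without_cast(void)
{
    char32_t c = U'\u20AC';   /* U+20AC EURO SIGN */
    return code_point_value(c);
}
```

Even on a platform where the typedefs differed, the call would still
compile, since conversions between integer types are implicit in C; the
static_assert only makes the "identical types" assumption explicit.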

Bruno






