Re: why is MB_LEN_MAX so large (16) on glibc

bug-gnulib

From:	Eric Blake
Subject:	Re: why is MB_LEN_MAX so large (16) on glibc
Date:	Wed, 13 May 2015 19:29:47 -0600
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 05/13/2015 06:30 PM, Bruno Haible wrote:

> The value of 4 is sufficient to accommodate all stateless encodings in
> use, including UTF-8 (which was restricted from max. 6 to 4 bytes by
> an ISO standard) and GB18030. But it's not necessarily future-proof.
> 
>> I was worried that it implied that wctomb() might convert a wide char
>> to _multiple_ encoded chars for some character/encoding combinations?

On Cygwin, where wchar_t is 2 bytes, we have the opposite problem - any
character outside the Basic Multilingual Plane of Unicode (that is,
> 0xffff) requires a surrogate pair of two wchar_t to represent a single
character, which violates the POSIX premise that a wchar_t holds a
character. It makes for some odd behavior with wctomb() and friends,
but it's the best that can be done.
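
To make the surrogate pair arithmetic concrete, here is a minimal sketch
of the standard UTF-16 split. The helper name encode_utf16() is made up
for illustration; it is not a Cygwin or libc interface, and it says
nothing about how Cygwin's wctomb() is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
   Returns the number of 16-bit units written to out (1 or 2),
   or 0 if cp is not a valid scalar value. */
static int encode_utf16(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* Reject values past U+10FFFF and lone surrogate code points. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    /* Code points up to 0xFFFF fit in a single 16-bit unit. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Anything above 0xFFFF is split into 20 bits: the high 10 bits
       go into a lead surrogate, the low 10 bits into a trail one. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    return 2;
}
```

For example, U+1F600 comes out as the two units 0xD83D 0xDE00, so a
2-byte wchar_t cannot hold it in one element.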

If the C11 char16_t and char32_t take off (with the corresponding
explosion in function interfaces), then switching the world to char32_t
instead of wchar_t would be the sane approach for dealing with wide
characters. But I don't know if that is likely to happen.
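
As a rough sketch of what the char32_t route looks like with the C11
<uchar.h> interface, assuming the platform ships that header: the
wrapper name encode_c32() is hypothetical, while c32rtomb() and
MB_LEN_MAX are the standard ones. The result depends on the current
locale's encoding.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical wrapper: convert one char32_t to the multibyte encoding
   of the current locale. buf must have room for MB_LEN_MAX bytes.
   Returns the number of bytes stored, or (size_t)-1 if the character
   cannot be represented in the current locale. */
static size_t encode_c32(char32_t c, char buf[MB_LEN_MAX])
{
    mbstate_t state;

    /* Start from the initial conversion state for each character. */
    memset(&state, 0, sizeof state);
    return c32rtomb(buf, c, &state);
}
```

Since char32_t is wide enough for any code point, one call handles one
character regardless of how wide wchar_t is on the platform.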

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


Current Thread
Re: why is MB_LEN_MAX so large (16) on glibc, Bruno Haible, 2015/05/13
- Re: why is MB_LEN_MAX so large (16) on glibc, Paul Eggert, 2015/05/13
- Re: why is MB_LEN_MAX so large (16) on glibc, Eric Blake <=
- Re: why is MB_LEN_MAX so large (16) on glibc, Pádraig Brady, 2015/05/14