From: Eric Blake
Subject: Re: [SCM] GNU Autoconf source repository branch, master, updated. v2.65-35-ga2889ee
Date: Wed, 03 Feb 2010 06:08:18 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20090812 Thunderbird/ Mnenhy/

According to Paolo Bonzini on 2/3/2010 2:48 AM:
> On 02/01/2010 07:44 PM, Ralf Wildenhues wrote:
>> On my Cygwin, C means UTF-8. I suppose though that's just because
>> Cygwin changed recently.
> That's a royally bad idea, since the only portable way to handle files
> that potentially contain invalid multibyte sequences, is to set the
> locale to C.

There was a HUGE thread on this topic on both the cygwin and Austin Group
mailing lists, which I don't want to repeat here.  If you want to complain
about cygwin's choice of locale, take it to the cygwin list.  That said:

Cygwin 1.7.1 defaults to C.UTF-8 in the absence of a specific request, and
treats C like C.UTF-8, but you can select a unibyte locale with C.ASCII.

The upcoming Cygwin 1.7.2 will continue to default to C.UTF-8, but will
treat C like C.ASCII.

POSIX states that the C locale can be used in any byte context for all 256
bytes.  But for character contexts, it can only portably be used for
characters < 128.  Cygwin satisfies these rules, whether you use C.UTF-8
or C.ASCII (and therefore, whether you use C in cygwin 1.7.1 or C in
cygwin 1.7.2).  And any program that depends on the "C" locale providing
strictly unibyte encoding of characters is broken, per POSIX, so in a way,
cygwin 1.7.1 is doing a favor by helping root out non-portable programs.
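
To make the byte-versus-character distinction concrete, here is a minimal
sketch in C.  It is only an illustration under stated assumptions, not code
from Autoconf or Cygwin, and both function names are hypothetical.  The
first helper works purely on bytes, so it gives the same answer in any
locale.  The second asks the C library whether the "C" locale it provides
happens to be unibyte, rather than assuming so.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: count the bytes with value >= 128 in a buffer.
   This is a byte context: it never interprets the data as characters,
   so all 256 byte values are handled the same way in every locale. */
size_t count_high_bytes(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (buf[i] >= 128)
            n++;
    return n;
}

/* Hypothetical helper: select the "C" locale and report whether it is
   unibyte.  Returns 1 if MB_CUR_MAX is 1, 0 if the locale is multibyte,
   and -1 if setlocale() fails.  Note that this changes the locale of
   the whole process. */
int c_locale_is_unibyte(void)
{
    if (setlocale(LC_ALL, "C") == NULL)
        return -1;
    return MB_CUR_MAX == 1;
}
```

A program that only needs byte semantics can rely on the first style
everywhere.  One that needs character semantics for bytes >= 128 can call
something like c_locale_is_unibyte() and choose a code path from the
result, rather than hard-coding the assumption.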

Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
