Re: Patch for unicode in varnames...

From: Chet Ramey
Subject: Re: Patch for unicode in varnames...
Date: Tue, 13 Jun 2017 15:46:25 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.1.1

On 6/5/17 8:40 PM, Peter & Kelly Passchier wrote:
> On 06/06/2560 05:39, George wrote:
>> So if you had "Pokémon" as an identifier in a Latin-1-encoded script (byte 
>> value 0xE9 between the "k" and "m") and then tried running that script in a
>> UTF-8 locale, that byte sequence (0xE9 0x6D) would actually be invalid in 
>> UTF-8, so Eduardo's patch would indicate that the identifier is invalid and
>> fail to run the script.
> I often work with a locale that has a UTF-8 encoding and a
> different/older encoding that are incompatible. I haven't tried the
> patch, but if I use unicode characters in function names, if I write a
> script in one encoding, and run it in an environment in the other
> encoding, it still runs correctly, but it won't render correctly. 

This can lead to subtle failures. If a variable name uses a character that
is alphanumeric in the writer's locale but not in the default locale where
the script is executed, the writer has to set the locale explicitly in the
script; otherwise the name is rejected with a `not a valid identifier' error.
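
As a rough illustration of that failure mode, here is a small Python
sketch. It is not bash's implementation and not the proposed patch; the
function name is_valid_identifier and its rule (decode the bytes under the
given encoding, then require a letter or underscore followed by letters,
digits, or underscores) are assumptions made only for this example. It
shows the same bytes being accepted under one encoding and rejected under
another, using the Latin-1 "Pokémon" bytes from the quoted message.

```python
def is_valid_identifier(raw: bytes, encoding: str) -> bool:
    """Hypothetical model of a locale-aware identifier check.

    The bytes are decoded under `encoding`; a decoding failure or any
    character that is not a letter, digit, or underscore makes the
    name invalid. Real shells consult the C library's locale tables
    instead, so treat this only as an approximation.
    """
    try:
        name = raw.decode(encoding)
    except UnicodeDecodeError:
        # e.g. 0xE9 followed by 0x6D is not a valid UTF-8 sequence
        return False
    if not name:
        return False
    first_ok = name[0] == "_" or name[0].isalpha()
    rest_ok = all(ch == "_" or ch.isalnum() for ch in name[1:])
    return first_ok and rest_ok


# "Pokémon" as written by a Latin-1 editor: 0xE9 sits between "k" and "m".
latin1_name = b"Pok\xe9mon"

if __name__ == "__main__":
    for enc in ("latin-1", "utf-8", "ascii"):
        print(enc, is_valid_identifier(latin1_name, enc))
```

Under these assumptions the name is accepted only when it is read with
the encoding it was written in; re-encoding the script as UTF-8, or
setting the locale explicitly, is what makes it portable.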

``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    address@hidden    http://cnswww.cns.cwru.edu/~chet/
