Re: bug with case conversion of UTF-8 characters
From: Chet Ramey
Subject: Re: bug with case conversion of UTF-8 characters
Date: Sun, 25 Jan 2015 20:26:14 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 1/22/15 9:43 AM, Stephane Chazelas wrote:
> Bash Version: 4.3
> Patch Level: 30
> Release Status: release
>
> (Debian unstable amd64)
>
> $ LC_ALL=tr_TR.UTF-8 bash -c 'typeset -l a; a=İ; echo $a' | hd
> 00000000 69 b0 0a |i..|
> 00000003
> $ a=İ LC_ALL=tr_TR.UTF-8 bash -c 'echo ${a,,}' | hd
> 00000000 69 b0 0a |i..|
> 00000003
>
> In Turkish locales on a GNU system at least, uppercase i is İ,
> not I. And lowercase I is ı, not i.
Thanks for the report, especially the example that showed bash's assumption
that the lowercase and uppercase versions of a letter have the same width.
I would not count on the above fact about Turkish locales being true across
all systems; it's not true on Mac OS X, for instance.
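
To illustrate the width mismatch in the report: in UTF-8, İ (U+0130) is the two bytes c4 b0, while i is the single byte 69. An in-place conversion that assumes equal widths would overwrite only the first byte and leave b0 behind, which is consistent with the 69 b0 output above. The following is a minimal sketch that only counts bytes; `byte_len` is a hypothetical helper written for this example, and it does not reproduce bash's internal conversion code.

```shell
# Hypothetical helper: count the bytes in a string, whatever the locale.
byte_len() {
    # wc -c counts bytes; the arithmetic expansion strips any padding
    # that some wc implementations put around the number.
    echo $(( $(printf '%s' "$1" | wc -c) ))
}

# Build the characters from octal escapes so the script itself stays ASCII.
upper=$(printf '\304\260')   # U+0130 (capital I with dot above): bytes c4 b0
lower=$(printf '\151')       # U+0069 (i): byte 69

upper_bytes=$(byte_len "$upper")
lower_bytes=$(byte_len "$lower")

echo "upper=$upper_bytes lower=$lower_bytes"
```

The two counts differ, so any case conversion between these characters has to be able to change the length of the string.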
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/