bug-bash

Re: bug with case conversion of UTF-8 characters


From: Chet Ramey
Subject: Re: bug with case conversion of UTF-8 characters
Date: Sun, 25 Jan 2015 20:26:14 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 1/22/15 9:43 AM, Stephane Chazelas wrote:

> Bash Version: 4.3
> Patch Level: 30
> Release Status: release
> 
> (Debian unstable amd64)
> 
> $ LC_ALL=tr_TR.UTF-8 bash -c 'typeset -l a; a=İ; echo $a' | hd
> 00000000  69 b0 0a                                          |i..|
> 00000003
> $ a=İ LC_ALL=tr_TR.UTF-8 bash -c 'echo ${a,,}' | hd
> 00000000  69 b0 0a                                          |i..|
> 00000003
> 
> In Turkish locales on a GNU system at least, uppercase i is İ,
> not I. And lowercase I is ı, not i.

Thanks for the report, especially the example that showed bash's assumption
that the lowercase and uppercase versions of a letter have the same width.
I would not count on the above fact about Turkish locales being true across
all systems; it's not true on Mac OS X, for instance.
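The width mismatch behind the reported output can be illustrated with a minimal shell sketch. It is not bash's internal code; it only assumes UTF-8 encoding, where İ (U+0130) is the two bytes c4 b0 and i is the single byte 69, and it mimics an in-place conversion that overwrites the first byte and leaves the rest untouched. The variable names are made up for illustration.

```shell
# İ (U+0130) written as its UTF-8 bytes c4 b0, using octal escapes so the
# script does not depend on the encoding of this file or the current locale.
upper=$(printf '\304\260')
lower='i'

# Count bytes, not characters; the arithmetic expansion strips any padding
# that wc may add.
upper_len=$(( $(printf '%s' "$upper" | wc -c) ))
lower_len=$(( $(printf '%s' "$lower" | wc -c) ))
echo "uppercase bytes: $upper_len, lowercase bytes: $lower_len"

# Hypothetical same-width conversion: write 'i' over the first byte and
# keep the trailing byte b0, which is what the hexdump in the report shows.
broken_hex=$(printf 'i\260' | od -An -tx1 | tr -d ' \n')
echo "result bytes: $broken_hex"
```

A conversion that handles this case has to let the result shrink or grow, because the lowercase form occupies one byte while the uppercase form occupies two.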

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/


