bug-bash

Re: bug with case conversion of UTF-8 characters


From: Stephane Chazelas
Subject: Re: bug with case conversion of UTF-8 characters
Date: Mon, 2 Oct 2017 17:49:46 +0100
User-agent: Mutt/1.5.24 (2015-08-30)

2015-01-22 14:43:00 +0000, Stephane Chazelas:
[...]
> Bash Version: 4.3
> Patch Level: 30
> Release Status: release
> 
> (Debian unstable amd64)
> 
> $ LC_ALL=tr_TR.UTF-8 bash -c 'typeset -l a; a=İ; echo $a' | hd
> 00000000  69 b0 0a                                          |i..|
> 00000003
[...]

Hi. While that particular bug seems to have been fixed in 4.4,
it looks like there's still a problem in those Turkish locales,
where the uppercase of i is İ and the lowercase of I is ı.

$ X=AEIOU LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X,,}"'
aeIou
$ X=aeiou LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X^^}"'
AEiOU

Same issue with typeset -l/-u.

For comparison, awk gives the expected results:

$ X=aeiou LC_ALL=tr_TR.UTF-8 awk 'BEGIN{print toupper(ENVIRON["X"])}'
AEİOU
$ X=AEIOU LC_ALL=tr_TR.UTF-8 awk 'BEGIN{print tolower(ENVIRON["X"])}'
aeıou
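
The bash and awk conversions above can be put side by side with a
small helper. This is only a sketch: compare_lower is a hypothetical
name, not something from bash or awk, and it assumes a bash new
enough to support ${X,,} plus any POSIX awk on PATH. If the requested
locale is not installed, bash falls back to the C locale (its warning
is discarded here), so the result is only meaningful where
tr_TR.UTF-8 actually exists.

```shell
# Hypothetical helper: print the lowercase form of $1 as computed by
# bash and by awk, both run under the locale named in $2.
compare_lower() {
    printf 'bash=%s awk=%s\n' \
        "$(LC_ALL=$2 X=$1 bash -c 'printf %s "${X,,}"' 2>/dev/null)" \
        "$(LC_ALL=$2 X=$1 awk 'BEGIN { printf "%s", tolower(ENVIRON["X"]) }' 2>/dev/null)"
}

# Differing bash= and awk= fields point at the discrepancy described above.
compare_lower AEIOU tr_TR.UTF-8
```

The string is passed through the environment (X) rather than spliced
into the -c script, so no extra quoting is needed for non-ASCII input.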

These conversions, on the other hand, are OK in bash:

$ X=AEİOU LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X,,}"'
aeiou
$ X=aeıou LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X^^}"'
AEIOU

nocasematch seems to be OK as well.
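
One way to check nocasematch is sketched below. The helper name
nocase_match is hypothetical, and the pairs tested are an assumption
about what "OK" means here (I matching ı, and İ matching i, in a
Turkish locale); no output is claimed, since it depends on the bash
version and on tr_TR.UTF-8 being installed.

```shell
# Hypothetical helper: succeeds if $2 and $3 compare equal under
# nocasematch in a child bash running with locale $1.
nocase_match() {
    LC_ALL=$1 A=$2 B=$3 bash -c 'shopt -s nocasematch; [[ $A == "$B" ]]' 2>/dev/null
}

# Assumed expectation for a Turkish locale: both pairs match.
if nocase_match tr_TR.UTF-8 I ı; then echo 'I/ı: match'; else echo 'I/ı: no match'; fi
if nocase_match tr_TR.UTF-8 İ i; then echo 'İ/i: match'; else echo 'İ/i: no match'; fi
```

The right-hand side is quoted inside [[ ]] so that it is compared as
a literal string rather than as a glob pattern.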

$ bash --version
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)

(on Debian).

-- 
Stephane


