bash-4.3: casemod word expansions broken with UTF-8

From: Ulrich Mueller
Subject: bash-4.3: casemod word expansions broken with UTF-8
Date: Sun, 15 Nov 2015 14:46:51 +0100

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: x86_64-pc-linux-gnu-gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' 
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL 
-DHAVE_CONFIG_H   -I. -I./include -I. -I./include -I./lib  
-DSYS_BASHRC='/etc/bash/bashrc' -DSYS_BASH_LOGOUT='/etc/bash/bash_logout' 
uname output: Linux juno 3.18.24-gentoo #1 SMP Sun Nov 8 10:43:05 CET 2015 
x86_64 Intel(R) Core(TM)2 Duo CPU T6570 @ 2.10GHz GenuineIntel GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.3
Patch Level: 42
Release Status: release

        In a UTF-8 locale such as en_US.UTF-8, the case-modifying
        parameter expansions sometimes return invalid UTF-8 encodings.

        This seems to happen when the UTF-8 byte sequences encoding
        the upper-case and lower-case forms have different lengths.

        $ LC_ALL=en_US.UTF-8
        $ x=$'\xc4\xb1' # LATIN SMALL LETTER DOTLESS I
        $ echo -n "${x^}" | od -t x1
        0000000 49 b1

        This should have output "49" (for "I") only. The "b1" is a
        continuation byte and is illegal as the first byte of a UTF-8
        sequence.

        $ x=$'\xe1\xba\x9e' # LATIN CAPITAL LETTER SHARP S
        $ echo -n "${x,}" | od -t x1
        0000000 c3 9f 9e

        This should have output "c3 9f" (for "sharp s") only.
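
        The length mismatch in both examples can be checked
        independently of bash's case conversion. The following is only
        a sketch: byte_len is a hypothetical helper, not part of bash,
        and the octal escapes are used so that printf behaves the same
        in any POSIX shell.

```shell
# Hypothetical helper: count the bytes in a string, whatever the locale.
byte_len() {
    # wc -c counts bytes; the arithmetic expansion strips any padding.
    echo $(( $(printf '%s' "$1" | wc -c) ))
}

dotless_i=$(printf '\304\261')            # c4 b1     U+0131 LATIN SMALL LETTER DOTLESS I
capital_i=$(printf '\111')                # 49        U+0049 LATIN CAPITAL LETTER I
capital_sharp_s=$(printf '\341\272\236')  # e1 ba 9e  U+1E9E LATIN CAPITAL LETTER SHARP S
small_sharp_s=$(printf '\303\237')        # c3 9f     U+00DF LATIN SMALL LETTER SHARP S

echo "dotless i -> I:       $(byte_len "$dotless_i") -> $(byte_len "$capital_i") bytes"
echo "capital sharp s -> ß: $(byte_len "$capital_sharp_s") -> $(byte_len "$small_sharp_s") bytes"
```

        In both cases the converted character is one byte shorter than
        the original, and both faulty results above (49 b1 and
        c3 9f 9e) have exactly the byte length of the original string.
        That is consistent with the guess that differing sequence
        lengths trigger the problem, but it does not by itself
        identify the cause.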
