\c escape within $'...' can produce mangled UTF-8

bug-bash


From:	Dmitry Groshev
Subject:	\c escape within $'...' can produce mangled UTF-8
Date:	Sat, 14 Aug 2010 23:01:50 +0400

Configuration Information [Automatically generated, do not change]:
Machine: i686
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='i686'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='i686-pc-linux-gnu'
-DCONF_VENDOR='pc' -DLOCALEDIR='/usr/local/share/locale'
-DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include
-I./lib   -g -O2
uname output: Linux wjlair 2.6.24.5-smp #1 SMP Fri Aug 14 19:13:09 MSD
2009 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ AuthenticAMD
GNU/Linux
Machine Type: i686-pc-linux-gnu

Bash Version: 4.1
Patch Level: 0
Release Status: release

Description:
        In a UTF-8 locale (such as ru_RU.UTF-8), a \c escape within
$'...' produces an invalid UTF-8 string if it is followed by a multibyte
UTF-8 character: ansicstr() in lib/sh/strtrans.c consumes and converts
only the character's first byte, leaving the rest of the UTF-8 sequence
untouched.

Repeat-By:
        echo $'\cА' > utf8bug.txt

        The "А" character in the example is Cyrillic: U+0410, encoded in
UTF-8 as 0xD0 0x90. It is transformed into 0x10 0x90, which is not valid
UTF-8 (0x90 is a continuation byte with no lead byte before it).
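
        The byte arithmetic behind this can be sketched in plain shell.
This is only an illustration: it assumes the \c handling keeps the low
five bits of the byte that follows it (c & 0x1f), which matches the
0xD0 -> 0x10 result above but is not quoted from strtrans.c, and
ctrl_first_byte is a hypothetical helper name, not part of bash.

```shell
# Hypothetical helper: apply the assumed control-character mask (c & 0x1f)
# to the first byte only and pass the second byte through unchanged,
# mimicking the behaviour described in this report.
# $1 = first byte, $2 = second byte (hex digits, without the 0x prefix)
ctrl_first_byte() {
    printf '%02x %02x\n' "$(( 0x$1 & 0x1f ))" "$(( 0x$2 ))"
}

# U+0410 is 0xD0 0x90 in UTF-8; only the lead byte gets masked.
ctrl_first_byte D0 90
```

        To see what a particular bash build really emits, the string can
be piped through a hex dump in a UTF-8 locale, for example
printf '%s' $'\cА' | od -An -tx1 ; the result depends on the bash version.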

-- 
-= With best regards, Dmitry Groshev =-
