bug-bash
Re: Bash doesn't handle C with acute accent properly during readline's r


From: Chet Ramey
Subject: Re: Bash doesn't handle C with acute accent properly during readline's rl_change_case
Date: Fri, 12 May 2017 11:15:34 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.1.0

On 5/11/17 8:56 AM, Eduardo Bustamante wrote:
> The C with acute accent character: https://en.wikipedia.org/wiki/%C4%86
> 
> - Upper case
> dualbus@debian:~$ printf '\U0106\n'
> Ć
> 
> - Lower case
> dualbus@debian:~$ printf '\U0107\n'
> ć
> 
> Now, in bash, if you type in ć, then run readline `upcase-word' on it,
> instead of ending up with the UTF-8 multibyte string for U+0106 (0xC4
> 0x86), you end up with 0x07 0x87.
> 
> The parameter expansion doesn't seem to have that problem so I think
> it's a bug in readline:

Thanks for the report. This is a bug in readline.

> For some reason, rl_change_case thinks `c` is ASCII:
> 
> (gdb) call isascii((unsigned char)c)
> $8 = 1

Because casting it to unsigned char discards all but the least
significant 8 bits, and what remains is a valid ASCII character.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/


