Re: Bash vi mode's e command (end of word) goes to eol when hitting a unicode character
Tue, 4 Sep 2018 09:28:33 -0400
On Mon, Sep 03, 2018 at 01:13:03PM +0200, Enrico Maria De Angelis wrote:
> The version number of bash: GNU bash, version 4.4.23(1)-release
> The hardware and operating system: Arch Linux (constantly updated)
> The compiler used to compile: I didn't compile bash myself
> A description of the bug behaviour: & A short script or `recipe' which
> exercises the bug:
> While vi-editing a line like the following
> $ ls bulk32³ grids.dat COPYING
> with the cursor in normal mode at the beginning of the line, hitting e
> repeatedly causes the cursor to move, in order, to
> s of ls (correct)
> 2 of bulk32³ (correct, since Vim itself works like this, with an end of
> word being detected in between 2 and ³)
> end of line (wrong)
I can confirm this in Debian's bash 4.4.12 and in bash 5.0-alpha. It's
actually worse than Enrico reports.
First, the cursor doesn't actually move to the end-of-line character
('G'). The cursor moves one space *past* that.
Once there, pressing either 'h' or 'b' moves the cursor from end-of-line
back to the ³ character. That's fairly odd on its own, but it gets
even more interesting.
If you go back to beginning-of-line, then press 'e' 3 times (so the cursor
is beyond the 'G'), then press 'i' ' ' to insert a space character, the
multi-byte character gets broken up. What I see is this:
wooledg:~$ ls bulk32� � grids.dat COPYING
So, it seems the space was inserted in the middle of the byte sequence
that constituted the ³ character (0xc2 0xb3) originally, resulting in
two invalid-character bytes with a space in the middle.
This is in LANG=en_US.UTF-8 on Debian 9 amd64.
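
The broken display can be reproduced outside bash, which supports the
idea that the insertion point was a byte offset in the middle of the
character. This is only a sketch of the byte arithmetic in Python, not
a description of what readline does internally; the variable names and
the choice of offset (split_at) are my own assumptions.

```python
# Sketch only: mimics the observed result, not readline's actual code path.

# The sample command line, encoded as UTF-8 bytes.
line = "ls bulk32³ grids.dat COPYING".encode("utf-8")

# U+00B3 (superscript three) occupies two bytes in UTF-8: 0xc2 0xb3.
superscript = "³".encode("utf-8")
assert superscript == b"\xc2\xb3"

# Assumed insertion point: one byte into the two-byte sequence,
# i.e. an offset that is not on a character boundary.
split_at = line.index(superscript) + 1

# Insert a single space at that byte offset.
broken = line[:split_at] + b" " + line[split_at:]

# Decoding with errors="replace" turns each stray byte into U+FFFD,
# which is what a UTF-8 terminal typically shows for invalid bytes.
print(broken.decode("utf-8", errors="replace"))
```

The printed line has two replacement characters with a space between
them where the ³ used to be, matching what I see in the terminal.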