bug-bash


From: Enrico Maria De Angelis
Subject: Re: Bash vi mode's e command (end of word) goes to eol when hitting a unicode character
Date: Tue, 4 Sep 2018 19:54:15 +0200

Ow,
I'm sorry for not having investigated further, since I thought it was kind
of expected.
Thank you for doing it, Greg.
Hope this will be fixed.
Kind regards,
Enrico Maria

On Tue, 4 Sep 2018 at 15:28, Greg Wooledge <address@hidden>
wrote:

> On Mon, Sep 03, 2018 at 01:13:03PM +0200, Enrico Maria De Angelis wrote:
> > The version number of bash: GNU bash, version 4.4.23(1)-release
> > The hardware and operating system: Arch Linux (constantly updated)
> > The compiler used to compile: I didn't compile bash myself
> > A description of the bug behaviour: & A short script or `recipe' which
> > exercises the bug:
> > While vi-editing a line like the following
> > $ ls bulk32³ grids.dat COPYING
> > with the cursor in normal mode at the beginning of the line, hitting e
> > repeatedly causes the cursor to move, in order, to
> > s of ls (correct)
> > 2 of bulk32³ (correct, since Vim itself works like this, with an end of
> > word being detected in between 2 and ³)
> > end of line (wrong)
>
> I can confirm this in Debian's bash 4.4.12 and in bash 5.0-alpha.  It's
> actually worse than Enrico reports.
>
> First, the cursor doesn't actually move to the end-of-line character
> ('G').  The cursor moves one space *past* that.
>
> Once there, pressing either 'h' or 'b' moves the cursor from end-of-line
> back to the ³ character.  That's fairly odd on its own, but it gets
> even more interesting.
>
> If you go back to beginning-of-line, then press 'e' 3 times (so the cursor
> is beyond the 'G'), then press 'i' ' ' to insert a space character, the
> multi-byte character gets broken up.  What I see is this:
>
> wooledg:~$ ls bulk32� � grids.dat COPYING
>
> So, it seems the space was inserted in the middle of the byte sequence
> that constituted the ³ character (0xc2 0xb3) originally, resulting in
> two invalid-character bytes with a space in the middle.
>
> This is in LANG=en_US.UTF-8 on Debian 9 amd64.
>
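For anyone who wants to see the byte-level effect Greg describes without
touching readline, here is a minimal Python sketch. It does not reproduce
readline's code; it only assumes that the insertion offset ended up between
the two bytes of ³, which is what the output above suggests.
is_char_boundary is a hypothetical helper name, not a readline function.

```python
# "³" is U+00B3; UTF-8 encodes it as the two bytes 0xc2 0xb3.
line = "ls bulk32³ grids.dat COPYING".encode("utf-8")
cube = "³".encode("utf-8")
start = line.index(cube)  # byte offset of the lead byte 0xc2

# Assumption: the space lands between the lead byte and the
# continuation byte, i.e. at byte offset start + 1.
broken = line[:start + 1] + b" " + line[start + 1:]

# Decoding with errors="replace" shows each orphaned byte as U+FFFD,
# which is how a terminal typically renders invalid UTF-8.
shown = broken.decode("utf-8", errors="replace")
print(shown)


def is_char_boundary(buf, i):
    """Return True if byte offset i is not inside a multi-byte character.

    UTF-8 continuation bytes always look like 0b10xxxxxx, so an offset
    pointing at one of them is in the middle of a character.
    """
    return i == 0 or i == len(buf) or (buf[i] & 0xC0) != 0x80
```

Offsets start and start + 2 are character boundaries; start + 1 is not, so
a cursor position stored as a byte offset has to be checked or adjusted
before it is used as an insertion point in a multi-byte locale.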

