Re: ignoring control characters in character width
From: Patrice Dumas
Subject: Re: ignoring control characters in character width
Date: Tue, 5 Sep 2023 20:19:40 +0200
On Tue, Sep 05, 2023 at 09:09:18PM +0300, Eli Zaretskii wrote:
> > Date: Tue, 5 Sep 2023 20:01:53 +0200
> > From: Patrice Dumas <pertusus@free.fr>
> >
> > Currently, when counting the width of a line of characters, we count
> > control characters that are also spaces as having a width of 1. I think
> > that it is not good, as control characters either should not have a
> > width, for end of line, form feed, carriage return, or have a width that
> > is not well defined for vertical and horizontal tab. I suggest
> > considering all the control characters as having a width of 0. This will
> > be consistent with libunistring u8_strwidth, which I intend to use in C
> > code equivalent to perl code.
>
> Please define "control characters" for this purpose. Some of them are
> definitely not zero-width, for example, TAB.
Characters whose Unicode code points, in decimal, are in the range 0 to 31,
plus 127 (Delete). This includes the horizontal tab. It corresponds to
the [:cntrl:] character class.
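To make the proposed rule concrete, here is a minimal sketch in C. It is
neither the Texinfo code nor libunistring's u8_strwidth; the function names
is_ascii_control and ascii_line_width are hypothetical, and the sketch
assumes ASCII-only input (a byte above 127 is naively counted as one
column, which is wrong for multibyte UTF-8 text).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if c is in the range 0 to 31 or is
   127 (Delete), i.e. the control characters described above. */
int is_ascii_control(unsigned char c)
{
    return c < 32 || c == 127;
}

/* Hypothetical helper: width of a NUL-terminated string, counting
   control characters as 0 columns and every other byte as 1 column.
   Assumes ASCII input; bytes above 127 are naively counted as 1. */
size_t ascii_line_width(const char *s)
{
    size_t width = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (!is_ascii_control((unsigned char) *s))
            width++;
    }
    return width;
}
```

With this rule the horizontal tab, form feed, carriage return and end of
line all contribute nothing to the width, while an ordinary space still
counts as one column.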
> Also, depending on how control characters are displayed, their width
> could be even 4, for example if they are displayed as \nnn octal
> escapes.
It is in a context where they are displayed as encoded bytes.
--
Pat