Re: ignoring control characters in character width
From: Eli Zaretskii
Subject: Re: ignoring control characters in character width
Date: Tue, 05 Sep 2023 22:06:13 +0300
> Date: Tue, 5 Sep 2023 20:19:40 +0200
> From: Patrice Dumas <pertusus@free.fr>
> Cc: bug-texinfo@gnu.org
>
> On Tue, Sep 05, 2023 at 09:09:18PM +0300, Eli Zaretskii wrote:
> > > Date: Tue, 5 Sep 2023 20:01:53 +0200
> > > From: Patrice Dumas <pertusus@free.fr>
> > >
> > > Currently, when counting the width of a line of characters, we count
> > > control characters that are also spaces as having a width of 1. I think
> > > that it is not good, as control characters either should not have a
> > > width, for end of line, form feed, carriage return, or have a width that
> > > is not well defined for vertical and horizontal tab. I suggest
> > > considering all the control characters as having a width of 0. This will
> > > be consistent with libunistring u8_strwidth, which I intend to use in C
> > > code equivalent to perl code.
> >
> > Please define "control characters" for this purpose. Some of them are
> > definitely not zero-width, for example, TAB.
>
> Characters whose Unicode codepoints in decimal are in the range 0 to 31,
> and also 127 (Delete). This includes the horizontal tab. It
> corresponds to the [:cntrl:] character class.
Then I guess I still don't understand: how is TAB a zero-width
character?
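
To illustrate why TAB has no single well-defined width, here is a minimal
sketch, assuming the usual terminal convention that a TAB advances to the
next tab stop and that tab stops fall every 8 columns. The function name
`tab_advance` and the default of 8 are assumptions made for this example;
they are not taken from Texinfo or libunistring.

```python
def tab_advance(column: int, tab_stop: int = 8) -> int:
    """Return how many columns a TAB occupies when it starts at the
    given 0-based column, assuming tab stops every `tab_stop` columns
    (hypothetical helper, for illustration only)."""
    return tab_stop - (column % tab_stop)


if __name__ == "__main__":
    # The same character occupies a different number of columns
    # depending on where in the line it appears.
    for col in (0, 3, 7, 8):
        print(f"TAB at column {col} occupies {tab_advance(col)} column(s)")
```

Under that assumption a TAB occupies anywhere from 1 to 8 columns, so
neither 0 nor 1 describes it in general.
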
> > Also, depending on how control characters are displayed, their width
> > could be even 4, for example if they are displayed as \nnn octal
> > escapes.
>
> It is in a context where they are displayed as encoded bytes.
So what is the context of this discussion, if it is not display of
bytes? I really don't understand, could you elaborate?

Control characters can also be displayed as ^C, for example, in which
case they take 2 columns.
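
The display conventions mentioned in this thread can be compared with a
small sketch. It assumes the definition of control characters given above
(codepoints 0 to 31, plus 127) and three possible treatments: zero width,
caret notation, and \nnn octal escapes. The names `is_control` and `render`
are hypothetical and do not correspond to any Texinfo or libunistring
function.

```python
def is_control(ch: str) -> bool:
    """True for codepoints 0-31 and 127, the definition used in this thread."""
    cp = ord(ch)
    return cp <= 31 or cp == 127


def render(ch: str, style: str) -> str:
    """Return the text shown for one character under a display style.

    style is one of "zero", "caret" or "octal" (illustrative names).
    Characters that are not control characters are returned unchanged.
    """
    if not is_control(ch):
        return ch
    cp = ord(ch)
    if style == "zero":
        return ""                      # treated as occupying no columns
    if style == "caret":
        return "^" + chr(cp ^ 0x40)    # 0x03 becomes ^C, 0x7F becomes ^?
    if style == "octal":
        return f"\\{cp:03o}"           # 0x03 becomes \003
    raise ValueError(f"unknown style: {style}")


if __name__ == "__main__":
    for style in ("zero", "caret", "octal"):
        shown = render("\x03", style)
        print(f"{style}: {shown!r} takes {len(shown)} column(s)")
```

With these assumptions the same control character takes 0, 2 or 4
columns, which is why the display context has to be stated before a width
can be assigned.
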