From: Bize Ma
Subject: Re: Bash removes unrequested characters in bracket expressions (not a range).
Date: Sat, 24 Nov 2018 17:34:55 -0400

Chet Ramey (<address@hidden>) wrote:

> On 11/23/18 6:09 PM, Bize Ma wrote:
> > Bash Version: 4.4
> > Patch Level: 12
> > Release Status: release

> > Description:
> >
> > Bash is removing characters not explicitly listed in a bracket
> > expression (character range).
> > In this example, it is removing digits from other languages.
> What is your locale?

The locale used was en_US.utf-8, but it also happens with 459 of the
868 locales available under Debian (not in C, for example).

Also, in every affected locale except one (zh_CN.gb18030), setting either
LC_ALL=$loc or LC_COLLATE=$loc gave the same result.
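
A check of this kind can be sketched in Python. This is only a minimal
sketch: it assumes the effect shows up as two distinct characters comparing
equal under the locale's collation, which is a guess at the mechanism, and
`collates_equal` is a hypothetical helper name. The digit pair is just an
example.

```python
import locale

def collates_equal(loc, a, b):
    """Return True if a and b compare equal under loc's collation,
    False if they differ, or None if loc is not installed."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, loc)
    except locale.Error:
        return None  # locale not available on this system
    try:
        return locale.strcoll(a, b) == 0
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore the old setting

if __name__ == "__main__":
    # ASCII digit one versus ARABIC-INDIC DIGIT ONE (U+0661).
    for loc in ("C", "en_US.UTF-8"):
        print(loc, collates_equal(loc, "1", "\u0661"))
```

The result for any locale other than C depends on the collation tables
installed on the system, so no output is claimed here.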

But IMO locale collation should not be used for an explicit list.

I have been made aware that there is a
      cstart = cend = FOLD (cstart);
inside the `sm_loop.c` file that converts many individual characters
into ranges. If that understanding is correct, that is the
source of the difference from other shells.
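
To make the suspected effect concrete, here is a small Python model (it is
not bash's code). It assumes a toy weight table in which ASCII `1` and
U+0661 share a collation weight. The table and the names `weight`,
`match_exact` and `match_as_ranges` are invented for illustration only.

```python
# Toy collation weights (hypothetical, NOT taken from any real locale):
# ASCII '1' and ARABIC-INDIC DIGIT ONE (U+0661) share the same weight.
TOY_WEIGHT = {"0": 10, "1": 11, "\u0661": 11, "2": 12}

def weight(ch):
    """Toy collation weight; unlisted characters get a unique fallback."""
    return TOY_WEIGHT.get(ch, 1000 + ord(ch))

def match_exact(members, ch):
    """Explicit list: a character matches only if it is literally listed."""
    return ch in members

def match_as_ranges(members, ch):
    """Each member m is treated as the range m-m, compared by weight."""
    return any(weight(m) <= weight(ch) <= weight(m) for m in members)

if __name__ == "__main__":
    print(match_exact("012", "\u0661"))      # U+0661 is not listed
    print(match_as_ranges("012", "\u0661"))  # but it ties with '1' by weight
```

Under these assumed weights the exact test rejects U+0661 while the
range-based test accepts it, because the range `1-1` is evaluated by weight
and not by identity.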

My understanding is that a collation table *must have a "total order"*,
in fact a strict total order. If two characters `a` and `b` could sort as
equal, the order would fail to confirm that a character is
absent from the list. Consider characters `a`, `b` and `c`: if `a` and `b`
sort as equal, a sorted list in which we find `a` followed by `c` doesn't
confirm that `b` is absent, as the order could well be `b a c`.
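
One way to illustrate the argument is an order-based membership test, again
as a Python sketch. The weight tables `tied` and `strict` and the function
`contains_by_weight` are hypothetical. The point is only that a lookup
driven by the order cannot tell `a` from `b` once they tie.

```python
import bisect

def contains_by_weight(sorted_items, weights, ch):
    """Binary-search a list sorted by weight for something equal to ch."""
    keys = [weights[x] for x in sorted_items]
    i = bisect.bisect_left(keys, weights[ch])
    return i < len(keys) and keys[i] == weights[ch]

tied = {"a": 1, "b": 1, "c": 2}    # a and b sort as equal (not strict)
strict = {"a": 1, "b": 2, "c": 3}  # strict total order

items = ["a", "c"]                 # b is really absent from the list

if __name__ == "__main__":
    print(contains_by_weight(items, tied, "b"))    # b reported as present
    print(contains_by_weight(items, strict, "b"))  # b reported as absent
```

With the tied weights the search reports `b` as present although only `a`
and `c` are in the list. With the strict weights the same search correctly
reports it as absent.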

With a strict total order, there cannot be any character other than `a`
in the range `a-a`, and using the range `a-a` is equivalent (just slower
and more complex) to using the single character `a`.

If this is not the case, the error is in the collation table, not in using
single (faster) characters, and what should be updated is that
collation table, IMO.
