Re: char-class rules & please show examples of int. locales that use dif

bug-bash



From:	L A Walsh
Subject:	Re: char-class rules & please show examples of int. locales that use diff. char-class rules
Date:	Thu, 15 Jun 2017 14:05:33 -0700
User-agent:	Thunderbird


Chet Ramey wrote:

On 6/15/17 3:04 PM, L A Walsh wrote:

Two problems with locale-based rules are:
   1) they differ based on local convention, potentially,
even down to what "side of the street" you live on, and

That's precisely what makes them valuable to users.

---
   But such differences also make them incompatible with other
locales, including ASCII -- which would prevent them from running
most programs today.

   2) they don't account for or allow "data" (textual) from outside
a given locale.  For companies connected by an internet with
international customers, having a non-uniform standard is a
serious problem at best, and unworkable in practice.


We're not talking about `data' here. We're talking about characters that
can appear in shell identifier names. Don't try to muddy the issue.

----
   That wasn't trying to muddy the issue, but to clarify it -- all
text is ultimately a type of data.  How it is interpreted is key.  HTML
text partially solved its encoding problem by having a default and
headers to specify codepages.  The same will remain true in the future,
but the default has changed from a Western-encoding default to a
UTF-8 default.

   At this point, I'm proposing that Bash adopt a similar scheme by
allowing UTF-8.  ***Some*** extension could later add a codepage
definition to bash scripts if there is demand for it.  Given that
ASCII has sufficed for nearly 2 decades, adding support for all of the
world's languages via a compatible encoding doesn't seem to be
an onerous restriction.  Less than 50% of locales can use the
current ASCII, while less than 4% do not, _today_, support UTF-8.

That's already a problem in that I try to use a letter from
the Greek alphabet in a var name, and it doesn't work.  The
current code doesn't recognize letters outside some limited
POSIX-defined range. That's very constraining.


Please. The entire scope of this discussion is how to lift that
constraint.

   And I proposed a method that would not invalidate current ASCII
scripts.  Running scripts under a non-ASCII locale, with bash applying
locale-specific meanings to characters, won't run today's scripts --
it would *create* incompatibility.

   If those who use incompatible locales want such support,
I don't see a future extension for specifying a locale as
a problem.  But that support shouldn't stop us from moving ahead with
UTF-8 compatibility, as UTF-8 compatibility won't conflict with such
an extension (or at least it hasn't in regard to webpages).
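
   For illustration, here is a minimal sketch of the constraint
discussed above (a Greek letter in a var name).  It assumes a bash that
limits names to ASCII letters, digits and underscore, and it forces
LC_ALL=C on the child shell so the result does not depend on the
caller's locale.  The helper name try_name is hypothetical, not part of
bash.

```shell
#!/usr/bin/env bash
# Sketch only: compare how bash treats an assignment to an ASCII name
# versus a non-ASCII (Greek) name.

# try_name is a hypothetical helper: it runs "<name>=1" in a child bash
# and returns that child's exit status. LC_ALL=C is an assumption made
# here so the byte classification does not vary with the caller's locale.
try_name() {
    LC_ALL=C bash -c "$1=1" 2>/dev/null
}

try_name 'alpha'
ascii_status=$?

try_name 'α'
greek_status=$?

echo "alpha: exit status $ascii_status"
echo "α: exit status $greek_status"
```

   Under those assumptions the ASCII name is accepted as an
assignment, while the Greek one is not recognized as a name, so bash
treats the whole word as a command to look up and the child exits with
a non-zero status.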
