bug-bash


Re: char-class rules & please show examples of int. locales that use dif


From: L A Walsh
Subject: Re: char-class rules & please show examples of int. locales that use diff. char-class rules
Date: Thu, 15 Jun 2017 14:05:33 -0700
User-agent: Thunderbird


Chet Ramey wrote:
> On 6/15/17 3:04 PM, L A Walsh wrote:
>> Two problems with locale-based rules are:
>>    1) they differ based on local convention, potentially,
>> even down to what "side of the street" you live on, and
>
> That's precisely what makes them valuable to users.
---
   But such differences also make them incompatible with other
locales including ASCII -- which would prohibit them running
most programs today.
>>    2) they don't account or allow for "data" (textual) outside
>> of a given locale.  For companies connected by an internet with
>> international customers, having a non-uniform standard is a
>> serious problem at best, and unworkable in practice.
>
> We're not talking about `data' here. We're talking about characters that
> can appear in shell identifier names. Don't try to muddy the issue.
----
   That wasn't trying to muddy the issue, but to clarify it -- all text
is ultimately a type of data.  How it is interpreted is key.  HTML
text partially solved its encoding problem by having a default and
headers to specify codepages.  The same will hold in the future, but the
default has changed from a western-encoding default to a UTF-8 default.

   At this point, I'm proposing Bash allow a similar scheme -- of
allowing UTF-8.  There can be ***some*** extension to add a codepage
definition to bash scripts if there is a demand for it.  Given that
ASCII has sufficed for nearly 2 decades, adding support for all of the
world's languages via a compatible encoding doesn't seem to be
an onerous restriction.  Less than 50% of the locales can use the
current ASCII, while less than 4% do not, _today_, support UTF-8.


>> That's already a problem in that I try to use a letter from
>> the Greek alphabet, in a var name, and it doesn't work.  The
>> current code doesn't recognize letters outside some limited
>> POSIX-defined range. That's very constraining.
>
> Please. The entire scope of this discussion is how to lift that
> constraint.
  And I proposed allowing a method that would not invalidate
current ASCII scripts.  Methods to run scripts under a non-ASCII
locale where bash applies locale-specific meanings to characters won't
run today's scripts -- they would *create* incompatibility.
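
For illustration only, the ASCII-only naming rule being discussed can be
sketched as a small check.  This is not Bash's actual implementation:
`is_ascii_name` is a hypothetical helper, and the pattern simply mirrors the
POSIX definition of a "name" (underscores, digits, and letters from the
portable character set, not starting with a digit).  The check runs in a
subshell with LC_ALL=C so that the bracket ranges are compared byte-wise
regardless of the caller's locale.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not a bash builtin):
# succeeds only if $1 is an ASCII-only POSIX-style name.
# The function body is a subshell, so the LC_ALL change does not leak out.
is_ascii_name() (
    LC_ALL=C    # force byte-wise, locale-independent bracket ranges
    [[ $1 =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]
)

is_ascii_name "alpha" && echo "alpha: accepted"
is_ascii_name "α"     || echo "α: rejected"
```

Under this rule a Greek letter is rejected in every locale, which is the
constraint the thread is about; existing ASCII scripts pass unchanged.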

   If those wanting to support incompatible locales want such support,
I don't see a future extension to support specifying a locale to be
a problem.  But that support shouldn't stop moving ahead with UTF-8
compatibility, as UTF-8 compatibility won't conflict with such
an extension (or at least it hasn't in regard to webpages).



