Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
From: Chet Ramey
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Thu, 27 Jun 2013 15:13:12 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130509 Thunderbird/17.0.6
On 6/27/13 4:48 AM, Paolo Bonzini wrote:
> On 27/06/2013 09:33, Aharon Robbins wrote:
>> Hi Paolo.
>>
>>>> I still believe that there is no place other than the glibc locale
>>>> descriptions where this can be fixed.
>> This is necessary but not sufficient. All of gawk, grep, sed and bash
>> run on lots of non-GLIBC systems.
>
> On non-glibc systems they use gnulib's regex implementation, so they're
> fine.
You presume much. Bash, for instance, doesn't use a regex implementation,
especially not gnulib's. gnulib code is, in practice, difficult to use on
an individual module basis, and doesn't provide enough of a benefit to go
through the effort of breaking it out of gnulib and putting it into bash.
>
>> The locale definitions, even for
>> the same locale, vary wildly out in the wild. Therefore there's no
>> other practical choice but to fix each program to provide Rational
>> Range Interpretation.
>>
>> Fortunately, gawk and grep are already there, and I think the sed in
>> the git repo is as well. Once Bash turns this on as default, the
>> world will definitely be a better place, independent of GLIBC.
>
> I already explained this multiple times how this is completely delusional.
A little bit strong, no? If you use your own matching code, it's a small
matter to change strcoll to strcmp.
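
To illustrate the point, here is a minimal sketch of a range test for
single-byte characters in which the ordering function is a parameter.
The helper name in_range is hypothetical and is not taken from bash or
any other matcher; it only shows how small the strcoll-to-strcmp change
can be when the matching code is your own.

```c
#include <string.h>

/* Hypothetical helper, for illustration only.
   Return nonzero if the single-byte character c lies in the bracket
   range [lo-hi] under the ordering defined by cmp.  Pass strcoll for
   locale-collation ranges, or strcmp for representation-based ranges. */
static int in_range(char c, char lo, char hi,
                    int (*cmp)(const char *, const char *))
{
    /* strcoll and strcmp compare strings, so wrap each character
       in a one-character NUL-terminated string. */
    char cs[2] = { c, '\0' };
    char ls[2] = { lo, '\0' };
    char hs[2] = { hi, '\0' };

    return cmp(ls, cs) <= 0 && cmp(cs, hs) <= 0;
}
```

With strcmp, in_range('B', 'a', 'z', strcmp) is false in every locale,
because 'B' has a lower byte value than 'a'. With strcoll, the answer
depends on the collation rules in effect after setlocale(LC_COLLATE, ""),
and it is that dependence which produces the a<A<b<B ordering under
discussion on systems whose locale definitions interleave the cases.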
> 1) grep, sed, coreutils and so on will only use representation-based
> range interpretation (I prefer this more neutral term that also explains
> what's going on) if you use gnulib's regex implementation. And by
> default, they use glibc (I just checked grep).
>
> 2) Even if you switched the default, you would be at the mercy of
> distros. Distros prefer to avoid glibc replacements in single packages,
> because then all bugs have to be fixed in many different places. In
> fact, I checked grep and Fedora builds it with --without-included-regex.
There are systems of interest besides Linux and its distros.
> Not to mention how this is entirely Latin-centric. There are some
> encodings in which there is absolutely no relation between the encoding
> and the expected collation order.
And there's no portable way to obtain this information in any case, glibc
or not. So if the only ways to `fix' this are changing every locale
definition everywhere or changing the matching code, I vote for changing
the matching code. We just have to agree on an interpretation and make
sure the various matchers agree.
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/