Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
From: Paolo Bonzini
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Thu, 27 Jun 2013 15:27:40 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6
On 27/06/2013 14:11, Johannes Meixner wrote:
>
> Hello,
>
> On Jun 27 10:48 Paolo Bonzini wrote (excerpt):
>> On 27/06/2013 09:33, Aharon Robbins wrote:
>>>
>>> Fortunately, gawk and grep are already there, and I think the sed in
>>> the git repo is as well. Once Bash turns this on as default, the
>>> world will definitely be a better place, independent of GLIBC.
>>
>> I already explained multiple times how this is completely
>> delusional.
>>
>> 1) grep, sed, coreutils and so on will only use representation-based
>> range interpretation (I prefer this more neutral term that also explains
>> what's going on) if you use gnulib's regex implementation. And by
>> default, they use glibc (I just checked grep).
>>
>> 2) Even if you switched the default, you would be at the mercy of
>> distros. Distros prefer to avoid glibc replacements in single packages,
>> because then all bugs have to be fixed in many different places. In
>> fact, I checked grep and Fedora builds it with --without-included-regex.
>
>
> Right now I checked how grep is built in openSUSE via
> "configure --disable-silent-rules --without-included-regex"
Right thing to do, if you ask me...
> I do not care too much which kind of locale specific ordering
> or collating or regex behaviour is actually implemented
> as long as it works consistently in grep, gawk, sed, bash,...
>
> I would very much appreciate it if grep, gawk, sed, bash,...
> could agree on one same behaviour and provide clear
> documentation for those who compile it what the
> "commonly accepted upstream behaviour" is so that
> the binaries get built with that same behaviour
> by all distributors who like to be in compliance
> with upstream decisions.
Right now only gawk is different from the others, and not in a very
clean manner:
#ifndef GAWK
  /* Defer to the system regex library about the meaning
     of range expressions.  */
  regex_t re;
  char pattern[6] = { '[', 0, '-', 0, ']', 0 };
  char subject[2] = { 0, 0 };
  c1 = c;
  if (case_fold)
    {
      c1 = tolower (c1);
      c2 = tolower (c2);
    }
  pattern[1] = c1;
  pattern[3] = c2;
  regcomp (&re, pattern, REG_NOSUB);
  for (c = 0; c < NOTCHAR; ++c)
    {
      if ((case_fold && isupper (c))
          || (MB_CUR_MAX > 1 && btowc (c) == WEOF))
        continue;
      subject[0] = c;
      if (regexec (&re, subject, 0, NULL, 0) != REG_NOMATCH)
        setbit_case_fold_c (c, ccl);
    }
  regfree (&re);
#else
  c1 = c;
  if (case_fold)
    {
      c1 = tolower (c1);
      c2 = tolower (c2);
    }
  for (c = c1; c <= c2; c++)
    setbit_case_fold_c (c, ccl);
#endif
I would suggest that distros rip out the #else part of this #ifndef,
so that gawk also defers to the system regex library.
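
To make the difference between the two branches easier to try outside of dfa.c, here is a minimal standalone sketch of both strategies. It is not the dfa.c code: the names range_by_regex, range_by_value and NBYTES are made up for illustration, NBYTES is only a stand-in for NOTCHAR, and the case-folding and multibyte checks from the excerpt are left out. It assumes the range endpoints are plain alphanumeric characters, so that they cannot be mistaken for bracket-expression syntax.

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

#define NBYTES 256   /* hypothetical stand-in for NOTCHAR */

/* Strategy of the #ifndef GAWK branch: compile "[lo-hi]" with the
   system regex library and ask it, byte by byte, what the range
   contains.  Returns the number of matching bytes, or -1 if the
   pattern does not compile (for example a reversed range).  */
int
range_by_regex (unsigned char lo, unsigned char hi, bool set[NBYTES])
{
  char pattern[6] = { '[', (char) lo, '-', (char) hi, ']', 0 };
  char subject[2] = { 0, 0 };
  regex_t re;
  int count = 0;

  /* Assumption of this sketch: endpoints are not regex metacharacters.  */
  assert (isalnum (lo) && isalnum (hi));

  memset (set, 0, NBYTES * sizeof set[0]);
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;

  /* Byte 0 would give an empty subject string, so start at 1.  */
  for (int c = 1; c < NBYTES; ++c)
    {
      subject[0] = (char) c;
      if (regexec (&re, subject, 0, NULL, 0) == 0)
        {
          set[c] = true;
          ++count;
        }
    }
  regfree (&re);
  return count;
}

/* Strategy of the #else (GAWK) branch: the range is simply every
   byte value from lo to hi, whatever the locale says.  */
int
range_by_value (unsigned char lo, unsigned char hi, bool set[NBYTES])
{
  int count = 0;

  memset (set, 0, NBYTES * sizeof set[0]);
  for (int c = lo; c <= hi; ++c)
    {
      set[c] = true;
      ++count;
    }
  return count;
}
```

In the "C" locale both functions should report the same 26 bytes for a-z. To see where they diverge, call setlocale (LC_ALL, "") from a small main before range_by_regex and run it under the locale you care about; what the regex-based variant returns there depends on the C library and its version, so I am not quoting any expected numbers.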
Paolo