bug-bash

Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?


From: Linda Walsh
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Mon, 21 May 2012 12:37:26 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.24) Gecko/20100228 Lightning/0.9 Thunderbird/2.0.0.24 Mnenhy/0.7.6.666



Greg Wooledge wrote:
> On Mon, May 21, 2012 at 12:19:26PM -0700, Linda Walsh wrote:
>> Greg Wooledge wrote:
>>> For instance, on HP-UX 10.20, in the en_US.iso88591 locale:
>>>    A  a  ...  B  b
>>> Meanwhile, on Debian 6.0, in the en_US.iso88591 locale:
>>>    a A   ...  b B
>>
>> So which is correct?
>
> Both.  Locale collating order is determined by the OS.  You cannot
> rely on it, unless you set the LC_COLLATE variable to "C" or "POSIX",
> in which case you get ASCII behavior (accented letters are not part
> of the character set at all).
>
>> Anyone wanting to reference an upper or lower case range
>> [a-z] or [A-Z], is gonna hurt from this.
>
> Correct.
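
For illustration, the workaround described above (forcing the "C" locale before relying on a range) could look roughly like the sketch below. It is not code from this thread: the function names classify_letter and is_lower are made up for the example, and it uses LC_ALL rather than LC_COLLATE only because LC_ALL overrides LC_COLLATE if the caller happens to have it set.

```shell
#!/bin/sh
# classify_letter CHAR   (hypothetical helper, not from the thread)
# Prints "lower", "upper", or "other" for a single character, using ASCII
# ordering no matter which locale the caller is running in.
classify_letter() (
    # The parentheses make the function body a subshell, so this
    # assignment cannot leak into the caller's environment.
    LC_ALL=C
    case $1 in
        [a-z]) echo lower ;;
        [A-Z]) echo upper ;;
        *)     echo other ;;
    esac
)

# is_lower CHAR   (hypothetical helper, not from the thread)
# Alternative that avoids ranges entirely: a POSIX character class asks
# "is this a lowercase letter?" instead of "does this sort between a and z?".
is_lower() {
    case $1 in
        [[:lower:]]) return 0 ;;
        *)           return 1 ;;
    esac
}

for c in a B z Z 7; do
    printf '%s -> %s\n' "$c" "$(classify_letter "$c")"
done
```

Without the LC_ALL=C line, whether [a-z] also matches a letter such as B depends on the system's collation tables and on the shell, which is the inconsistency under discussion. The character-class form sidesteps the ordering question but follows the current locale's idea of what a lowercase letter is.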

----
        This is a prime example of POSIX being stupid and bad for
computer science.

        They take a deterministic behavior, define it to be
non-deterministic, and break thousands of programs.

        They cannot justify this, as they are supposed to document
current practice -- which has never been to treat the interpretation
of a-z/A-Z as random!

        Thus they are violating their own rules!  How can anyone follow
such lame directions?  Who in their right mind would have voted to
make ranges "worthless"?  Established, standard practice has never
been for such ranges to be worthless -- yet that is exactly what they
voted for.

        How is POSIX following its own rules?  If they don't follow
their own rules, how can anyone follow these new specifications,
which are obviously in conflict with established implementations?






