From: Linda Walsh
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Mon, 21 May 2012 12:37:26 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.24) Gecko/20100228 Lightning/0.9 Thunderbird/2.0.0.24 Mnenhy/0.7.6.666
Greg Wooledge wrote:
> On Mon, May 21, 2012 at 12:19:26PM -0700, Linda Walsh wrote:
>> Greg Wooledge wrote:
>>> For instance, on HP-UX 10.20, in the en_US.iso88591 locale: A a ... B b
>>> Meanwhile, on Debian 6.0, in the en_US.iso88591 locale: a A ... b B
>>
>> So which is correct?
>
> Both. Locale collating order is determined by the OS. You cannot rely on it, unless you set the LC_COLLATE variable to "C" or "POSIX", in which case you get ASCII behavior (accented letters are not part of the character set at all).
>
>> Anyone wanting to reference an upper or lower case range [a-z] or [A-Z], is gonna hurt from this.
>
> Correct.
----
This is a prime example of POSIX being stupid and bad for computer science. They take a deterministic behavior, define it to be non-deterministic, and break thousands of programs.

They cannot justify this, as they are supposed to document current practice, which has never been to treat the interpretation of a-z/A-Z as random! Thus they are violating their own rules. How can anyone follow such lame directions?

Who in their right mind would have voted to make ranges "worthless"? Established, standard practice has never been for such ranges to be worthless, yet that is exactly what they voted for.

How is POSIX following its own rules? If they don't follow their own rules, how can anyone follow these new specifications, which are obviously in conflict with established implementations?
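
For reference, the workaround Greg describes (forcing the "C" locale so that ranges get ASCII behavior) might be sketched roughly as follows. The helper name is_ascii_lower is made up for illustration and is not from the thread; the subshell is only there so the locale change does not leak into the rest of the script.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reports whether a single
# character matches [a-z] when collation is forced to the C locale.
# The function body is a subshell, so LC_ALL=C applies only inside it.
is_ascii_lower() (
    LC_ALL=C
    case $1 in
        [a-z]) echo yes ;;
        *)     echo no ;;
    esac
)

is_ascii_lower b   # prints "yes"
is_ascii_lower B   # prints "no": under LC_ALL=C, "B" sorts outside a-z
```

Where ASCII-only matching is not what is wanted, the POSIX character classes [[:lower:]] and [[:upper:]] are another option, since they test case membership rather than position in the collating order.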