From: | Linda Walsh |
Subject: | Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z? |
Date: | Mon, 21 May 2012 20:19:31 -0700 |
User-agent: | Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.24) Gecko/20100228 Lightning/0.9 Thunderbird/2.0.0.24 Mnenhy/0.7.6.666 |
Eric Blake wrote:
They still don't make any sense in any locale except C, because POSIX no longer requires collating order.

The regex(7) man page says that [xx-xx] uses *collating order*:

The regex(7) man page _of which system_? Just because _some_ systems (like glibc, picking the POSIX 1992 semantics) have well-defined semantics doesn't mean that all systems have those same semantics. According to POSIX, you cannot portably assume ANY semantics for ranges except in the C locale. And if RRI gains traction, that means that you can assume ASCII collation across ALL locales, but this is a different order than the collation of a specific locale, and it is also a GNU extension not guaranteed by POSIX.
=== Well, that would be nice, but if Unicode takes off, *cough*, and anyone claims Unicode compliance (isn't UTF-8 the standard for HTML5 and XML?), they are also guaranteed an ordering -- a full ordering for the full Unicode character set. It would be VERY GOOD if RRI didn't come up with an order that was DIFFERENT from the one prescribed by Unicode -- otherwise that could open another can of worms.
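To make the range question concrete, here is a rough Python sketch of why the same `[a-z]`-style range can cover different characters under different orderings. It does not call any real locale; `interleaved_key` is a hypothetical stand-in for a collation like the a<A<b<B<y<Y<z<Z one in the subject line, and `in_range` is an illustrative helper, not how any regex engine is actually implemented.

```python
letters = ["a", "A", "b", "B", "y", "Y", "z", "Z"]

# Code point order (what the C locale / ASCII gives you):
# every uppercase letter sorts before every lowercase letter.
codepoint_order = sorted(letters)

# Hypothetical stand-in for a locale collation: compare by letter
# first, then put lowercase before uppercase (a < A < b < B ...).
def interleaved_key(ch):
    return (ch.lower(), ch.isupper())

interleaved_order = sorted(letters, key=interleaved_key)

# Illustrative range test: ch is "in lo-hi" if it sorts between the
# two endpoints under the given ordering key.
def in_range(ch, lo, hi, key):
    return key(lo) <= key(ch) <= key(hi)

# What "a-z" covers under each ordering.
codepoint_range = [c for c in letters if in_range(c, "a", "z", ord)]
interleaved_range = [c for c in letters if in_range(c, "a", "z", interleaved_key)]

print(codepoint_order)    # ['A', 'B', 'Y', 'Z', 'a', 'b', 'y', 'z']
print(interleaved_order)  # ['a', 'A', 'b', 'B', 'y', 'Y', 'z', 'Z']
print(codepoint_range)    # ['a', 'b', 'y', 'z']
print(interleaved_range)  # ['a', 'A', 'b', 'B', 'y', 'Y', 'z']
```

Under the interleaved ordering the range picks up every uppercase letter except Z, which sorts after z; under code point ordering it picks up only the lowercase letters.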