Re: Unicode range and enumeration support.
From: Eli Schwartz
Subject: Re: Unicode range and enumeration support.
Date: Sun, 22 Dec 2019 01:38:13 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.3.0
On 12/20/19 7:35 PM, L A Walsh wrote:
> On 2019/12/18 11:46, Greg Wooledge wrote:
>> To put it another way: you can write code that determines whether
>> an input character $c matches a glob or regex like [Z-a]. (Maybe.)
>>
>> But, you CANNOT write code to generate all of the characters from Z to a
>>
> This generates characters from decimal 8300 - 8400 (because that range
> includes raised and lowered digits which have the number and value
> properties equivalent to 0-9.
>
> ----
>
> No? 8300, 8400 arbitrary code points that contain raised and lowered
> numbers
> that have the number property (as does 0..9):
>
> perl -we' use strict; use v5.16;
> my $c;
> for ($c=8300;$c<8400;++$c) {
> my $o=chr $c;
> printf "%s", $o if $o=~/\pN/; #match unicode property "is_num"
> };printf "\n"'
> ⁰⁴⁵⁶⁷⁸⁹₀₁₂₃₄₅₆₇₈₉
>
> Q.E.D.
>
>
> Is that sufficient proof?
It's sufficient proof that you're wrong, yes.
Given the discussion was about collation, not simply enumerating
codepoints in order of their codepoint values, it would be helpful to
actually, you know, collate them.
Given your sample text range:
$ printf %s\\n ⁰ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ | sort
⁰
₀
₁
₂
₃
⁴
₄
⁵
₅
⁶
₆
⁷
₇
⁸
₈
⁹
₉
This is plainly not in byte order.
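One way to check that claim mechanically rather than by eye: a rough sketch, assuming a POSIX sh and a sort that honors LC_ALL. The helper name in_byte_order is made up for illustration. It feeds the order printed above to sort -c in the C locale, which only verifies byte order and does not sort anything.

```shell
# The order that sort printed above, copied verbatim.
collated='⁰ ₀ ₁ ₂ ₃ ⁴ ₄ ⁵ ₅ ⁶ ₆ ⁷ ₇ ⁸ ₈ ⁹ ₉'

# in_byte_order WORD...: hypothetical helper. Succeeds only if the
# arguments are already sorted by raw byte value. In the C locale,
# sort -c checks the order and exits non-zero on the first disorder.
in_byte_order() {
    printf '%s\n' "$@" | LC_ALL=C sort -c 2>/dev/null
}

# $collated is left unquoted on purpose so it splits into one word
# per character.
if in_byte_order $collated; then
    echo "byte order"
else
    echo "not byte order"
fi
```

The check fails at the step from ₃ (U+2083) to ⁴ (U+2074), which is where the collated order and the codepoint order part ways.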
Now you need to ask yourself the question: which locale do you want to
sort according to? I used en_US.UTF-8. Please don't say "C.UTF-8",
because that's not actually a thing. And the plain C locale won't work
for obvious reasons...
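To see how much the answer depends on that choice, here is a small sketch that sorts the same sample under whatever locale you name. It assumes a POSIX sh; sort_in_locale is a made-up helper name, and whether en_US.UTF-8 is installed on your system is also an assumption (check with locale -a). If the locale is missing, sort will most likely fall back to byte order without complaining.

```shell
# The sample characters from the quoted perl output.
sample='⁰ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉'

# sort_in_locale LOCALE: hypothetical helper. Sorts the sample under
# LOCALE and prints the result on a single line.
sort_in_locale() {
    # $sample is left unquoted on purpose so it splits into one word
    # per character.
    printf '%s\n' $sample | LC_ALL="$1" sort | tr -d '\n'
    echo
}

# C locale: raw byte comparison, which for UTF-8 text gives the same
# order as the code point values.
sort_in_locale C

# A real collating locale; the result depends on that locale's
# collation tables.
sort_in_locale en_US.UTF-8
```

The first call gives back the codepoint enumeration. The second is the one that a range expression in that locale would have to agree with.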
--
Eli Schwartz
Arch Linux Bug Wrangler and Trusted User