emacs-devel

Re: Better emoji support


From: Robert Pluim
Subject: Re: Better emoji support
Date: Sun, 19 Sep 2021 20:40:11 +0200

>>>>> On Sun, 19 Sep 2021 21:29:44 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: Juri Linkov <juri@linkov.net>,  lists@traduction-libre.org,
    >> emacs-devel@gnu.org
    >> Date: Sun, 19 Sep 2021 20:10:22 +0200
    >> 
    Eli> Hmm...  Robert, I see quite a few characters that now belong to the
    Eli> emoji script, which shouldn't be there, AFAIU.  The above is one of
    Eli> them (AFAIK, the Arrows block doesn't belong to Emoji).  But there are
    Eli> more stark cases, for example:
    >> 
    >> The whole block might not, but some of the codepoints do:
    >> 
    >> 2194..2199    ; Emoji                # E0.6   [6] (↔️..↙️)    left-right 
arrow..down-left arrow

    Eli> Only if followed by a variation selector VS-16, right?

Iʼm inclined to agree, but Iʼd have to re-read tr51, and I have a
headache. They definitely have Emoji_Presentation=No.

    Eli> (aref char-script-table ?#) => emoji
    Eli> (aref char-script-table ?0) => emoji
    >> 
    >> I donʼt see that here (and itʼs definitely not the
    >> intention). Blocks.awk skips any ASCII codepoints (and those both
    >> evaluate to "latin" here). Could you double-check your
    >> lisp/international/charscript.el?

    Eli> I see them there:

    Eli>     (#x0023 #x0023 emoji) ; Autogenerated emoji
    Eli>     (#x002A #x002A emoji) ; Autogenerated emoji
    Eli>     (#x0030 #x0039 emoji) ; Autogenerated emoji
    Eli>     (#x00A9 #x00A9 emoji) ; Autogenerated emoji
    Eli>     (#x00AE #x00AE emoji) ; Autogenerated emoji

    Eli> Which corresponds to these lines in emoji-data.txt:

    Eli>   0023          ; Emoji                # E0.0   [1] (#️)       hash sign
    Eli>   002A          ; Emoji                # E0.0   [1] (*️)       asterisk
    Eli>   0030..0039    ; Emoji                # E0.0  [10] (0️..9️)    digit zero..digit nine
    Eli>   00A9          ; Emoji                # E0.6   [1] (©️)       copyright
    Eli>   00AE          ; Emoji                # E0.6   [1] (®️)       registered

Blocks.awk has this:

FILENAME ~ "emoji-data.txt" && /^00[0-9A-F]{2}.*; Emoji / {
    next
}

so those should have been filtered out. (This is perhaps where I learn
more about Awk incompatibilities than I care to.)
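
One plausible culprit, offered as an assumption and not confirmed
anywhere in this thread, is the `{2}` interval expression. As far as I
know, some Awk implementations (older mawk releases, or gawk before 4.0
unless run with --re-interval) do not treat `{2}` as a repetition
count, so the rule would never match and the ASCII lines would pass
through. A minimal sketch of a rule that avoids intervals, run against
an abridged, illustrative excerpt in the emoji-data.txt format (the
FILENAME test is dropped here because the sample is read from stdin):

```shell
# Abridged sample lines in the emoji-data.txt layout (illustrative only).
sample='0023          ; Emoji                # E0.0   [1] hash sign
0030..0039    ; Emoji                # E0.0  [10] digit zero..digit nine
2194..2199    ; Emoji                # E0.6   [6] left-right arrow..down-left arrow'

# Spell out the two hex digits instead of relying on {2},
# which not every awk interprets as a repetition count.
filtered=$(printf '%s\n' "$sample" |
  awk '/^00[0-9A-F][0-9A-F].*; Emoji / { next } { print $1 }')

# Only the non-ASCII range should survive the filter.
printf '%s\n' "$filtered"
```

With the sample above, only the `2194..2199` range is printed; the two
`00xx` entries are skipped.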

    Eli> It seems like these characters ended up in the emoji script because
    Eli> they should render as emoji when followed by variation selectors?  But
    Eli> in that case, the place to do this is in composition-function-table,
    Eli> if we can, and if we cannot, let's for now decide we don't support
    Eli> these sequences, because the cure sounds worse than the disease with
    Eli> our current infrastructure.
    >> 
    Eli> Am I missing something?
    >> 
    >> Are you now saying that we only want to add to the emoji script those
    >> characters with Emoji_Presentation=Yes?

    Eli> Yes, I think so.  Are there any downsides to that?

Not that I can see. As a side effect it will fix whatever is causing
those ASCII codepoints to be treated as Emoji for you.
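
A rough sketch of what that selection could look like, assuming
emoji-data.txt lists Emoji_Presentation=Yes codepoints on lines whose
property field is `Emoji_Presentation`. This is not the actual
blocks.awk change, and the sample lines are an abridged, illustrative
excerpt:

```shell
# Abridged sample lines in the emoji-data.txt layout (illustrative only).
sample='0023          ; Emoji                # hash sign
2194..2199    ; Emoji                # left-right arrow..down-left arrow
231A..231B    ; Emoji_Presentation   # watch..hourglass done
1F600         ; Emoji_Presentation   # grinning face'

# With default whitespace splitting: $1 is the codepoint or range,
# $2 is the ";" separator, and $3 is the property name.
# Comparing $3 exactly keeps plain "Emoji" lines out.
selected=$(printf '%s\n' "$sample" |
  awk '$2 == ";" && $3 == "Emoji_Presentation" { print $1 }')

printf '%s\n' "$selected"
```

Because the ASCII entries (`0023`, `002A`, `0030..0039`) only carry the
plain `Emoji` property in this sample, they are excluded without needing
a separate filter on the codepoint range.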

Robert