bug#39799: 28.0.50; Most emoji sequences don’t render correctly


From: Eli Zaretskii
Subject: bug#39799: 28.0.50; Most emoji sequences don’t render correctly
Date: Tue, 21 Sep 2021 12:16:38 +0300

> From: Robert Pluim <rpluim@gmail.com>
> Cc: rgm@gnu.org,  39799@debbugs.gnu.org,  mfabian@redhat.com
> Date: Mon, 20 Sep 2021 22:38:28 +0200
> 
> Iʼve just pushed a change to master that should fix (almost) all the
> issues with displaying emoji sequences (except for keycaps). Feedback
> welcome.

Thanks, this is mostly okay, IMO.  The only issue I have with this is
here:

> --- a/admin/unidata/blocks.awk
> +++ b/admin/unidata/blocks.awk
> @@ -221,6 +221,46 @@ FILENAME ~ "emoji-data.txt" && /^[0-9A-F].*; Emoji_Presentation / {
>  }
>  
>  END {
> +    ## These codepoints have Emoji_Presentation = No, but they are
> +    ## used in emoji-sequences.txt and emoji-zwj-sequences.txt (with a
> +    ## Variation Selector), so force them into the emoji script so
> +    ## they will get composed correctly.  FIXME: delete this when we
> +    ## can change the font used for a codepoint based on whether it's
> +    ## followed by a VS (usually VS-16)
> +    idx = 0
> +    override_start[idx] = "261D"
> +    override_end[idx] = "261D"
> +    idx++
> +    override_start[idx] = "26F9"
> +    override_end[idx] = "26F9"
> +    idx++
> +    override_start[idx] = "270C"
> +    override_end[idx] = "270D"
> +    idx++
> +    override_start[idx] = "2764"
> +    override_end[idx] = "2764"
> +    idx++
> +    override_start[idx] = "1F3CB"
> +    override_end[idx] = "1F3CC"
> +    idx++
> +    override_start[idx] = "1F3F3"
> +    override_end[idx] = "1F3F4"
> +    idx++
> +    override_start[idx] = "1F441"
> +    override_end[idx] = "1F441"
> +    idx++
> +    override_start[idx] = "1F575"
> +    override_end[idx] = "1F575"
> +
> +    for (k in override_start)
> +    {
> +        i++
> +        start[i] = override_start[k]
> +        end[i] = override_end[k]
> +        alt[i] = "emoji"
> +        name[i] = "Autogenerated emoji (override)"
> +    }

Specifically, the U+2xxx codepoints are now in the 'emoji' script,
which I think is undesirable, even if the price is that we won't
support the sequences in which those codepoints are followed by
VS-16.  So I think we should remove those codepoints from the above,
leaving only the U+1Fxxx ones.

Btw, currently U+261D followed by VS-16 doesn't compose for me,
probably because compose-gstring-for-variation-glyph is hardcoded to
work only for Han characters, and U+261D isn't, or because that
function is not suited to VS-16 (it looks for glyph variations in the
font)?  Or am I missing something?

Now to my idea of supporting those "U+2xxx VS-16" sequences without
assigning them to the 'emoji' script:

The function autocmp_chars uses font_range to find whether the
sequence of characters that can be composed is supported by the same
font.  It currently takes the first character of the sequence, calls
font_for_char for it, then checks that all the rest of the characters
are supported by that font by calling font_encode_char.  In our case,
the first character of the sequence is U+2xxx, which is not in the
'emoji' script, so Emacs is likely to pick up a font that doesn't
support Emoji, and the composition will fail.  To avoid that, I
propose the following change:

  . add a new argument to font_range, the codepoint that triggered the
    composition
  . inside font_range, if that codepoint belongs to the 'emoji' script
    (use char-script-table to find that out), call font_for_char with
    a representative character for 'emoji' (from
    script-representative-chars) instead of the first character of the
    sequence, then check that all the sequence characters, including
    the first one, can be supported by that font; if they can, return
    that font to the caller, to be used for the composition
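
The two steps above can be sketched as a toy model in C.  This is not
Emacs source: every type and function below (toy_font,
toy_font_for_char, toy_font_range_current, toy_font_range_proposed,
and so on) is hypothetical, the two fonts and their coverage are made
up, and treating U+FE0F as belonging to the 'emoji' script is an
assumption of the model.  It only illustrates why probing with a
representative Emoji character could select a different font than
probing with the first character of the sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names are Emacs's real types or APIs. */

enum toy_script { TOY_SYMBOL, TOY_EMOJI };

struct toy_range { uint32_t lo, hi; };   /* inclusive codepoint range */

struct toy_font {
  const char *name;
  struct toy_range cover[4];             /* codepoints the font has glyphs for */
  size_t ncover;
};

/* Made-up fonts, listed in the order the toy font lookup tries them. */
static const struct toy_font toy_fonts[] = {
  { "toy-symbols", { { 0x2600, 0x27BF } }, 1 },
  { "toy-emoji",
    { { 0x2600, 0x27BF }, { 0xFE0F, 0xFE0F }, { 0x1F300, 0x1F6FF } }, 3 },
};

/* Hypothetical stand-in for an entry of script-representative-chars. */
#define TOY_EMOJI_REPRESENTATIVE 0x1F600u

/* Stand-in for a char-script-table lookup.  ASSUMPTION: U+FE0F counts
   as 'emoji' here; U+2xxx codepoints deliberately do not.  */
static enum toy_script toy_char_script (uint32_t c)
{
  if (c == 0xFE0F || (c >= 0x1F300 && c <= 0x1F6FF))
    return TOY_EMOJI;
  return TOY_SYMBOL;
}

static bool toy_font_has (const struct toy_font *f, uint32_t c)
{
  for (size_t i = 0; i < f->ncover; i++)
    if (c >= f->cover[i].lo && c <= f->cover[i].hi)
      return true;
  return false;
}

/* Stand-in for font_for_char: first font in the list that covers C. */
static const struct toy_font *toy_font_for_char (uint32_t c)
{
  for (size_t i = 0; i < sizeof toy_fonts / sizeof toy_fonts[0]; i++)
    if (toy_font_has (&toy_fonts[i], c))
      return &toy_fonts[i];
  return NULL;
}

static bool toy_font_covers_all (const struct toy_font *f,
                                 const uint32_t *seq, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (!toy_font_has (f, seq[i]))
      return false;
  return true;
}

/* Behavior as described above: choose the font from the first
   character, then require it to cover the whole sequence.  */
static const struct toy_font *
toy_font_range_current (const uint32_t *seq, size_t len)
{
  assert (len > 0);
  const struct toy_font *f = toy_font_for_char (seq[0]);
  return (f && toy_font_covers_all (f, seq, len)) ? f : NULL;
}

/* Proposed variant: if the triggering codepoint is 'emoji', choose the
   font from a representative Emoji character instead, then check every
   character of the sequence, including the first one.  */
static const struct toy_font *
toy_font_range_proposed (const uint32_t *seq, size_t len, uint32_t trigger)
{
  assert (len > 0);
  uint32_t probe = (toy_char_script (trigger) == TOY_EMOJI
                    ? TOY_EMOJI_REPRESENTATIVE : seq[0]);
  const struct toy_font *f = toy_font_for_char (probe);
  return (f && toy_font_covers_all (f, seq, len)) ? f : NULL;
}
```

In this model, for the sequence U+261D U+FE0F with U+FE0F as the
trigger, toy_font_range_current returns NULL (the symbol font chosen
for U+261D lacks U+FE0F), while toy_font_range_proposed returns the
made-up "toy-emoji" font.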

WDYT?

Btw, if you use Firefox or Chrome, or some other application that can
show Emoji sequences, or maybe just use HarfBuzz's hb-view, how does
the display of the U+2xxx codepoints change when they are followed by VS-16?  Is
the change prominent enough for us to try to support it?  If not,
perhaps the above should be left out for the moment.




