bug-gnu-emacs

bug#39799: 28.0.50; Most emoji sequences don’t render correctly


From: Robert Pluim
Subject: bug#39799: 28.0.50; Most emoji sequences don’t render correctly
Date: Fri, 28 Feb 2020 13:21:59 +0100

>>>>> On Fri, 28 Feb 2020 10:25:22 +0200, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Mike FABIAN <mfabian@redhat.com>
    >> Cc: 39799@debbugs.gnu.org
    >> Date: Fri, 28 Feb 2020 08:36:10 +0100
    >> 
    >> > Patches are welcome to convert the emoji-related files in Unicode's
    >> > character database into appropriate composition-function-table setup,
    >> > similar to the example above.  Some script to be run at Emacs build
    >> > time and produce, say, lisp/emoji.el to populate
    >> > composition-function-table, would be nice (see the Awk scripts in
    >> > admin/unidata as one source of inspiration).
    >> 
    >> Pango also has a .c file which is generated by a python script from
    >> the Unicode emoji data files to make all these sequences known to Pango.
    >> 
    >> I can try to write a script. Would it be OK to use Python for such a
    >> script generating emoji.el?

    Eli> I'd prefer not to add Python as prerequisite for building Emacs.  We
    Eli> already use Awk, so using that'd be fine.

I suck at awk, but my attempt is attached. It DTRT for me under Cairo
if I change my fontset settings to use 'Noto Color Emoji' instead of
Symbola for:

             (#x1F300 . #x1F5FF)        ;; Misc Symbols and Pictographs
             (#x1F900 . #x1F9FF)        ;; Supplemental Symbols and Pictographs

It matches forward off the first char, so the
composition-function-table entries all have '0' as the number of
preceding chars to match. Would it be better to match backwards? Weʼd
run into the 4-character maximum for that, since some of the sequences
are 7 or more characters long.

    >> > If you mean they are not displayed in correct colors, then Emacs
    >> > doesn't yet support color emoji, we lack some infrastructure for
    >> > that.  Again, work in that area is welcome, it should be relatively
    >> > easy since we now have HarfBuzz support for text shaping.
    >> 
    >> Actually the color display works already. I tested with current master
    >> (build with cairo) and the emoji display just fine in color.

    Eli> Maybe in a Cairo build.  Or maybe I'm missing something.

Iʼm not seeing colour emoji in a -Q Cairo build. Which sequence is this
again?

Robert

#!/usr/bin/awk -f

## Copyright (C) 2020 Free Software Foundation, Inc.

## Author: Robert Pluim <rpluim@gmail.com>

## This file is part of GNU Emacs.

## GNU Emacs is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 3 of the License, or
## (at your option) any later version.

## GNU Emacs is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.

## You should have received a copy of the GNU General Public License
## along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.

### Commentary:

## This script takes as input Unicode's emoji-zwj-sequences.txt
## (https://www.unicode.org/Public/emoji/12.0/emoji-zwj-sequences.txt)
## and produces output for Emacs's lisp/international/emoji-zwj.el.

## For additional details, see <https://debbugs.gnu.org/39799#8>.

## Things to do after installing a new version of emoji-zwj-sequences.txt:
## Check the output against the old output.

### Code:

/^[0-9A-F]/ {
    sub(/  *;.*/, "", $0)
    num = split($0, elts)
    if (ch[elts[1]] == "")
    {
        vec[elts[1]] = ""
        ch[elts[1]] = elts[1]
    }
    else
    {
        vec[elts[1]] = vec[elts[1]] " "
    }
    vec[elts[1]] = vec[elts[1]] "\""
    for (j = 1; j <= num; j++)
    {
        c = sprintf("\\N{U+%s}", elts[j])
        vec[elts[1]] = vec[elts[1]] c
    }
    vec[elts[1]] = vec[elts[1]] "\""
}

END {
    print ";;; emoji-zwj.el --- emoji zwj character composition table"
    print ";;; Automatically generated from admin/unidata/emoji-zwj-sequences.txt"
    print "(dolist (elt '("

    for (elt in ch)
    {
        printf("(#x%s . (%s))\n", elt, vec[elt])
    }
    print "    ))"
    print "  (set-char-table-range composition-function-table"
    print "                        (car elt)"
    print "                        (list (vector (regexp-opt (cdr elt))"
    print "                                      0"
    print "                                      'compose-gstring-for-graphic))))"
    print "\n"
    print "(provide 'emoji-zwj)"
}
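
As a rough illustration of what the main rule in the script does to each input line, here is a cut-down sketch that is not part of the attachment. It feeds two sample lines, written in the format of emoji-zwj-sequences.txt, through the same `sub`/`split`/`sprintf` steps and prints the first code point followed by the Lisp string built for the sequence. The shell variables `sample` and `out` and the awk variable `s` are names made up for this example; the real script instead accumulates the strings per first character in `vec` and prints them in the END block.

```sh
# Two sample lines in emoji-zwj-sequences.txt format, plus a comment line
# that the /^[0-9A-F]/ pattern is expected to skip.
sample='1F468 200D 1F467 ; Emoji_ZWJ_Sequence ; family: man, girl
# comment lines are skipped
1F441 FE0F 200D 1F5E8 FE0F ; Emoji_ZWJ_Sequence ; eye in speech bubble'

out=$(printf '%s\n' "$sample" | awk '
/^[0-9A-F]/ {
    # Drop everything from the first " ;" onwards, leaving only code points.
    sub(/  *;.*/, "", $0)
    num = split($0, elts)
    # Build one Lisp string literal body out of \N{U+XXXX} escapes.
    s = ""
    for (j = 1; j <= num; j++)
        s = s sprintf("\\N{U+%s}", elts[j])
    printf("#x%s \"%s\"\n", elts[1], s)
}')

printf '%s\n' "$out"
```

Each output line pairs the character that keys the composition-function-table entry with one of the strings that the generated file later hands to regexp-opt.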

Attachment: emoji-zwj.el
Description: application/emacs-lisp

