bug-gnu-emacs

bug#54562: 28.0.91; Emoji sequence not composed


From: Robert Pluim
Subject: bug#54562: 28.0.91; Emoji sequence not composed
Date: Mon, 28 Mar 2022 14:46:09 +0200

>>>>> On Mon, 28 Mar 2022 14:51:49 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: Lars Ingebrigtsen <larsi@gnus.org>,  54562@debbugs.gnu.org,  Eli
    >> Zaretskii <eliz@gnu.org>
    >> Date: Mon, 28 Mar 2022 09:47:54 +0200
    >> 
    >> OK. So it sounds like we should perhaps look at doing composition for
    >> the codepoints in that block by doing face lookup based on the
    >> combining character rather than the base character.

    Eli> I guess we should try.  It should be optional behavior, because Emacs
    Eli> never did that, and I cannot predict what that will do to all the
    Eli> different use cases where we compose text, and thus whether users will
    Eli> like that in all the cases.  It could, for example, mean that a
    Eli> particular Latin character with a diacritic will be displayed with a
    Eli> font that's different from the rest of the Latin text, which some
    Eli> users might consider worse than seeing just the base character in the
    Eli> "expected" font.  And that's just the simplest use case.

Yes, thatʼs exactly what happens with U+0308 here sometimes, see
screenshot below. I had to search a bit to find a font to use as the
default that didnʼt have a glyph for U+0308, so Iʼm not sure how
important this issue is in practice.

    Eli> And I think "based on combining character" is not the correct
    Eli> definition.  We should allow selection of the font based on the
    Eli> character that triggered the composition, i.e. the character whose
    Eli> slot in composition-function-table stores the rule which we are using
    Eli> to produce the composition.  Like we already do for Emoji.  For
    Eli> combining characters, the default is that the combining character is
    Eli> that trigger.  By contrast, today we use the font for the first
    Eli> character in the composition sequence (NOT the base character, as I
    Eli> incorrectly wrote earlier, although in practice it is the same for
    Eli> Latin).

Imprecise wording on my part. It would indeed be the triggering
character, as with emoji.
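
The difference between the two lookup strategies discussed above can be illustrated with a small standalone sketch. This is not Emacs code: `rule_slots`, `has_rule`, and `font_lookup_char` are hypothetical names, and the tiny array merely stands in for the slots of composition-function-table that hold a rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for composition-function-table: the code
   points whose slot stores a composition rule.  The two entries are
   chosen only for illustration.  */
static const int rule_slots[] = { 0x0308, 0x20E3 };

static bool
has_rule (int ch)
{
  for (size_t i = 0; i < sizeof rule_slots / sizeof rule_slots[0]; i++)
    if (rule_slots[i] == ch)
      return true;
  return false;
}

/* Return the character of SEQ (LEN > 0) that would drive font
   selection.  With USE_TRIGGER false this is the first character of
   the sequence, which is the behavior described in the thread as the
   current one.  With USE_TRIGGER true it is the first character that
   has a rule, falling back to the first character if none does.  */
static int
font_lookup_char (const int *seq, size_t len, bool use_trigger)
{
  assert (len > 0);
  if (use_trigger)
    for (size_t i = 0; i < len; i++)
      if (has_rule (seq[i]))
        return seq[i];
  return seq[0];
}
```

For the sequence U+006F U+0308, the first strategy selects the font via U+006F and the second via U+0308, which is why the composed glyph can end up in a different font from the surrounding Latin text.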

    >> Eli, should we look at doing that for other combining characters,
    >> such as Andreas' 0308?

    Eli> "Look at" in what sense?

'consider'

Rough patch attached. It handles U+20E3, U+0308, and U+20D0..U+20FF. It
works reasonably well, but U+006F U+0308 (o plus combining diaeresis)
suffers from the font problem you were worried about. With Bitstream
Vera Mono as the default font, the composed glyph ends up coming from
Latin Modern Roman, which looks very different.
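
For readers who want to check which code points the font.c part of the patch affects, here is a minimal standalone sketch. `combining_lookup_eligible` mirrors the condition in the patch's predicate as a single expression; `count_eligible` is a hypothetical helper added only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Same condition as codepoint_is_combining_lookup_eligible in the
   patch, written as a single expression.  */
static bool
combining_lookup_eligible (int ch)
{
  return ch == 0x0308 || (0x20D0 <= ch && ch <= 0x20FF);
}

/* Hypothetical helper: count the eligible code points in the
   inclusive range LO..HI.  */
static int
count_eligible (int lo, int hi)
{
  assert (lo <= hi);
  int n = 0;
  for (int ch = lo; ch <= hi; ch++)
    if (combining_lookup_eligible (ch))
      n++;
  return n;
}
```

Since U+20E3 already lies inside U+20D0..U+20FF, the range check covers it without a separate test. Other combining marks such as U+0301 are not affected by this predicate.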

The composed glyphs for U+20D0..U+20FF look pretty bad in all the
fonts Iʼve tried so far: Unifont, FreeSans, FreeMono, Menlo,
Bitstream Vera Mono. Does anyone have an idea of a good font for
those?

Robert
-- 

[Attached screenshot: PNG image]

diff --git i/admin/unidata/emoji-zwj.awk w/admin/unidata/emoji-zwj.awk
index 3d605d5d64..331095d56f 100644
--- i/admin/unidata/emoji-zwj.awk
+++ w/admin/unidata/emoji-zwj.awk
@@ -69,6 +69,7 @@ END {
      # emoji sequences.  We have code in font.c:font_range that will
      # try to display them with the emoji font anyway.
 
+     trigger_codepoints[0] = "20E3"
      trigger_codepoints[1] = "261D"
      trigger_codepoints[2] = "26F9"
      trigger_codepoints[3] = "270C"
diff --git i/src/font.c w/src/font.c
index 7e0219181c..265bec6ce5 100644
--- i/src/font.c
+++ w/src/font.c
@@ -3937,6 +3937,14 @@ codepoint_is_emoji_eligible (int ch)
   return false;
 }
 
+static bool
+codepoint_is_combining_lookup_eligible (int ch)
+{
+  if ((0x20D0 <= ch && ch <= 0x20FF) || ch == 0x308)
+    return true;
+  return false;
+}
+
 /* Check how many characters after character/byte position POS/POS_BYTE
    (at most to *LIMIT) can be displayed by the same font in the window W.
    FACE, if non-NULL, is the face selected for the character at POS.
@@ -3996,6 +4004,13 @@ font_range (ptrdiff_t pos, ptrdiff_t pos_byte, ptrdiff_t *limit,
            val = AREF (val, 0);
          font_object = font_for_char (face, XFIXNAT (val), pos, string);
        }
+    } else if (codepoint_is_combining_lookup_eligible (ch))
+  /* If the triggering codepoint is a combining character, use the
+     font of that character rather than the font of the base
+     character, since that increases the chances of composition
+     working.  */
+    {
+      font_object = font_for_char (face, ch, pos, string);
     }
 
   while (pos < *limit)
