bug#54562: 28.0.91; Emoji sequence not composed


From: Eli Zaretskii
Subject: bug#54562: 28.0.91; Emoji sequence not composed
Date: Tue, 29 Mar 2022 14:44:47 +0300

> From: Robert Pluim <rpluim@gmail.com>
> Cc: luangruo@yahoo.com,  larsi@gnus.org,  54562@debbugs.gnu.org
> Date: Tue, 29 Mar 2022 12:45:44 +0200
> 
>     Eli> I thought about any Mn character whose canonical-combining-class
>     Eli> property is 200 and above.  The COMBINING ENCLOSING <SOMETHING> stuff
>     Eli> will need to be added to that, of course.  And we could have that
>     Eli> option have multiple possible values, not just on/off...
> 
> OK. Would Me be ok for you, or would you specifically want only the
> codepoints from the "Combining Diacritical Marks for Symbols" block?

Using Me is fine with me.
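
For illustration only, the eligibility test discussed above (Mn
characters whose canonical-combining-class is 200 or more, plus the Me
enclosing marks) could be sketched in Lisp roughly as follows.  The
function name `my/diacritic-font-lookup-p' and its MODE argument are
hypothetical, made up for this sketch and mirroring the option values
proposed in the quoted text below; only `get-char-code-property' and
the two Unicode properties are existing Emacs facilities.

```elisp
;; Hypothetical helper, not part of Emacs: decide whether CHAR should
;; take part in the proposed font lookup for a given MODE.
(defun my/diacritic-font-lookup-p (char mode)
  "Return non-nil if CHAR is covered by MODE.
MODE is one of the symbols `all', `enclosing' or `combining-class'."
  (let* ((category (get-char-code-property char 'general-category))
         (ccc (get-char-code-property char 'canonical-combining-class))
         ;; Me: enclosing marks such as U+20DD COMBINING ENCLOSING CIRCLE.
         (enclosing (eq category 'Me))
         ;; Mn with canonical-combining-class 200 and above.
         (combining (and (eq category 'Mn)
                         (integerp ccc)
                         (>= ccc 200))))
    (cond ((eq mode 'all) (or enclosing combining))
          ((eq mode 'enclosing) enclosing)
          ((eq mode 'combining-class) combining)
          (t nil))))
```

With this sketch, U+0301 COMBINING ACUTE ACCENT (Mn, class 230)
qualifies under `combining-class' and `all', U+20DD qualifies under
`enclosing' and `all', and U+093C DEVANAGARI SIGN NUKTA (Mn, class 7)
qualifies under none of them.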

> I guess you'd want options like:
> 
> 'all => combining-class + enclosing
> 'enclosing
> 'combining-class
> 
> (did we want to cover the 'number followed U+20E3 => emoji' case with
> an option too?)

That's a separate issue, IMO, and it can be handled via
auto-composition-emoji-eligible-codepoints, I think?  We could even
tell users to do that by themselves.

> 
>     Eli> Btw, for sequences that include a base character and 2 or more
>     Eli> diacritics, selecting a font that supports the first diacritic (the
>     Eli> one which triggers the composition) might not be enough, since the
>     Eli> rest of the diacritics could be absent from that font.  Instead, we'd
>     Eli> need something like "find the font for each one of them and then use
>     Eli> the one which supports the largest subset of them".
> 
> font_range currently only has access to the first diacritic, so that
> would be a bigger change. And that subset had better have the same
> size as the number of unique diacritics, otherwise itʼs unlikely to
> work.

We could perhaps avoid the complexity by rewriting the composition
rule for diacritics.  Instead of "\\c.\\c^+" with 1-character
look-back, we could have several rules:

   "\\c.\\c^\\c^\\c^\\c^" with 4-character look-back
   "\\c.\\c^\\c^\\c^+"    with 3-character look-back
   "\\c.\\c^\\c^+"        with 2-character look-back
   "\\c.\\c^+"            with 1-character look-back

(in that order).  I didn't test this, but if it works, maybe it could
solve the problem without any deep changes on the C level.
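
A rough, equally untested sketch of what installing those rules might
look like, modeled on the loop that sets up the default "\\c.\\c^+"
rule for Mn, Mc and Me characters.  It assumes that loop has the usual
shape (a list of [PATTERN PREV-CHARS FUNC] vectors stored in
composition-function-table for each such character); whether the
longer look-backs actually behave as hoped is exactly the open
question.

```elisp
;; Experimental: replace the single diacritic rule with several rules,
;; longest look-back first.  Each vector is [PATTERN PREV-CHARS FUNC].
(let ((elt '(["\\c.\\c^\\c^\\c^\\c^" 4 compose-gstring-for-graphic]
             ["\\c.\\c^\\c^\\c^+"    3 compose-gstring-for-graphic]
             ["\\c.\\c^\\c^+"        2 compose-gstring-for-graphic]
             ["\\c.\\c^+"            1 compose-gstring-for-graphic]
             ;; Fallback for a mark with no preceding base character.
             [nil 0 compose-gstring-for-graphic])))
  (map-char-table
   (lambda (key val)
     ;; KEY is a character or a range; VAL is its general category.
     (when (memq val '(Mn Mc Me))
       (set-char-table-range composition-function-table key elt)))
   unicode-category-table))
```

Evaluating this in a scratch session changes the rules for every
combining mark, including ones for which script-specific rules were
installed earlier, so it is only suitable for experimenting.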
