bug-gnu-emacs

bug#54562: 28.0.91; Emoji sequence not composed


From: Robert Pluim
Subject: bug#54562: 28.0.91; Emoji sequence not composed
Date: Tue, 29 Mar 2022 16:50:10 +0200

>>>>> On Tue, 29 Mar 2022 14:44:47 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: luangruo@yahoo.com,  larsi@gnus.org,  54562@debbugs.gnu.org
    >> Date: Tue, 29 Mar 2022 12:45:44 +0200
    >> 
    Eli> I thought about any Mn character whose canonical-combining-class
    Eli> property is 200 and above.  The COMBINING ENCLOSING <SOMETHING> stuff
    Eli> will need to be added to that, of course.  And we could have that
    Eli> option have multiple possible values, not just on/off...
    >> 
    >> OK. Would Me be ok for you, or would you specifically want only the
    >> codepoints from the "Combining Diacritical Marks for Symbols" block?

    Eli> Using Me is fine with me.

OK. There are probably subtleties surrounding things like U+20D2 that
I need to read up on (or we say "overlays are deprecated, letʼs ignore
them").
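
To make the distinction concrete, here is a rough Python sketch (not
Emacs code, and not a proposed implementation) that classifies a
character using the two criteria discussed above: general category Me,
and Mn with canonical-combining-class of 200 and above. The mode names
mirror the option values floated in this thread; the function name
`eligible` and the `threshold` parameter are made up for illustration.

```python
import unicodedata


def eligible(ch, mode="all", threshold=200):
    """Return True if CH would be covered under the given (hypothetical) mode.

    mode is one of "all", "enclosing", "combining-class", mirroring the
    option values discussed in this thread.
    """
    # Enclosing marks: Unicode general category Me.
    enclosing = unicodedata.category(ch) == "Me"
    # Nonspacing marks (Mn) whose canonical combining class is >= threshold.
    high_class = (unicodedata.category(ch) == "Mn"
                  and unicodedata.combining(ch) >= threshold)
    if mode == "enclosing":
        return enclosing
    if mode == "combining-class":
        return high_class
    if mode == "all":
        return enclosing or high_class
    raise ValueError("unknown mode: %r" % (mode,))


# U+20E3 COMBINING ENCLOSING KEYCAP, U+0301 COMBINING ACUTE ACCENT,
# U+20D2 COMBINING LONG VERTICAL LINE OVERLAY
for cp in (0x20E3, 0x0301, 0x20D2):
    ch = chr(cp)
    print("U+%04X %s ccc=%d all=%s" % (
        cp, unicodedata.category(ch), unicodedata.combining(ch),
        eligible(ch)))
```

Under this sketch U+20D2 falls through both tests, since it is Mn with
an overlay combining class well below 200, which is the kind of subtlety
meant above.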

    >> I guess you'd want options like:
    >> 
    >> 'all => combining-class + enclosing
    >> 'enclosing
    >> 'combining-class
    >> 
    >> (did we want to cover the 'number followed by U+20E3 => emoji' case
    >> with an option too?)

    Eli> That's a separate issue, IMO, and it can be handled via
    Eli> auto-composition-emoji-eligible-codepoints, I think?  We could even
    Eli> tell users to do that by themselves.

We could, although my purist side doesnʼt want to do it, since the
standard exists for a reason, dammit.

    Eli> We could perhaps avoid the complexity by rewriting the composition
    Eli> rule for diacritics.  Instead of "\\c.\\c^+" with 1-character
    Eli> look-back, we could have several rules:

    Eli>    "\\c.\\c^\\c^\\c^\\c^" with 4-character look-back
    Eli>    "\\c.\\c^\\c^\\c^+"    with 3-character look-back
    Eli>    "\\c.\\c^\\c^+"        with 2-character look-back
    Eli>    "\\c.\\c^+"            with 1-character look-back

    Eli> (in that order).  I didn't test this, but if it works, maybe it could
    Eli> solve the problem without any deep changes on the C level.

That might work. What would the fallback look like? Suppose we have 4
diacritics, 3 of which are covered by the same font, and one by a
different one. Would you prefer to attempt to use the font of 3 of
them, or would you prefer to fall back to the font of the base
character? (Iʼm not sure which would give better results in practice;
they might both fail.)

Robert
-- 
