bug-gnu-emacs

bug#20140: 24.4; M17n shaper output rejected


From: Eli Zaretskii
Subject: bug#20140: 24.4; M17n shaper output rejected
Date: Mon, 14 Feb 2022 15:19:36 +0200

> Date: Sun, 13 Feb 2022 20:53:10 +0000
> From: Richard Wordingham <richard.wordingham@ntlworld.com>
> Cc: larsi@gnus.org, 20140@debbugs.gnu.org
> 
> On Sun, 13 Feb 2022 18:04:11 +0200
> Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > But that didn't seem to work well enough: e.g., some marks in your
> > "sample text" didn't combine with letters, as I think they should.
> 
> Which ones?

Sorry, that was my faulty testing: I tested a half-baked change.  Your
rules do work correctly, AFAICT.

But I have 2 questions:

 1) Why do we need this part of the composition rules:

     (vector "." 0 'font-shape-gstring)

    This matches just one character, so what do we want to accomplish
    by this rule?  A single character cannot "self-compose", can it?

 2) Since tai-tham-composable-pattern always starts with what you
    denote as "C", how about setting up only entries of
    composition-function-table that correspond to those characters,
    i.e.:

     (let ((elt (list (vector tai-tham-composable-pattern 0 'font-shape-gstring))))
       (set-char-table-range composition-function-table '(#x1A20 . #x1A54) elt)
       (set-char-table-range composition-function-table '(#x1A80 . #x1A89) elt)
       (set-char-table-range composition-function-table '(#x1A90 . #x1A99) elt)
       (set-char-table-range composition-function-table '(#x1AA0 . #x1AAD) elt))

    Do you see any problems with that?
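
    One way to sanity-check such a setup without touching the real
    composition-function-table is to apply the same ranges to a
    scratch char-table and look up a few characters.  The sketch below
    assumes this is a fair stand-in for the real table; the names
    demo-composition-table and demo-pattern are hypothetical, and
    demo-pattern is a much-simplified placeholder, not the actual
    tai-tham-composable-pattern.

```elisp
;; Scratch char-table, so the real `composition-function-table' is untouched.
;; The subtype symbol `demo-composition' is hypothetical and has no extra slots.
(defvar demo-composition-table (make-char-table 'demo-composition))

;; Hypothetical, much-simplified stand-in for `tai-tham-composable-pattern':
;; one character from the first range followed by any characters
;; from U+1A55..U+1A7F.
(defvar demo-pattern "[\u1A20-\u1A54][\u1A55-\u1A7F]*")

;; Register the same rule for each range proposed above.
(let ((elt (list (vector demo-pattern 0 'font-shape-gstring))))
  (dolist (range '((#x1A20 . #x1A54) (#x1A80 . #x1A89)
                   (#x1A90 . #x1A99) (#x1AA0 . #x1AAD)))
    (set-char-table-range demo-composition-table range elt)))

;; A character inside a range now has a rule attached...
(aref demo-composition-table #x1A20)
;; ...while a character outside the ranges, such as U+1A60 TAI THAM
;; SIGN SAKOT, has no entry of its own.
(aref demo-composition-table #x1A60)
```

    With entries only on the starting characters, a lookup on a mark
    alone finds no rule, which is the effect the proposal above is
    after.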

> I did suspect the problem was writing '\u1A7C' instead of
> '\u1a7c', but I'm no longer so sure.

No, that's not a problem.

> You should also add CGJ and ZWNJ, and some people may appreciate ZWJ -
> the Khottabun font has ligatures involving ZWJ, though it may just be
> an experimental feature - and ultimately WJ, for when someone writes a
> Tai Tham word breaker.

How should I add CGJ and ZWNJ?  What are the rules?

> Oh, and Thai and Lao mai t(r)i and mai chat(t)awa and U+0324
> COMBINING DIAERESIS BELOW turn up occasionally - U+0324 is supported
> in Thep's Khottabun font, and my Da Lekh series supports Thai mai
> tri and mai chattawa. These characters seem to work with HarfBuzz.

Not sure I understand: what patterns/rules should be added for these?

> If using the native Windows renderer is an option with Emacs, then 'A
> Tai Tham KH New' works better than 'A Tai Tham KH New V3'.

We still support Uniscribe, but prefer HarfBuzz, because MS deprecated
Uniscribe.  We cannot support DirectWrite, because its APIs are
C++-only, and no one has shown whether and how to call them from C.

> > Btw, is there a way to get all the examples from your
> > https://wrdingham.co.uk/lanna/renderer_test.htm as a UTF-8 encoded
> > text file?  I'd like to test the Emacs rendering with all of the
> > examples, but copy-pasting each example separately from the browser is
> > not my idea of useful time investment.  So if you could provide the
> > examples as a downloadable text file, I'd appreciate it.
> 
> As buried (you're not the only one to have overlooked it) in the
> penultimate paragraph of 'Content and Layout' section, "The test words
> may, in principle, be extracted quite simply from this web page. Each
> test 'word' is the content of the first cell in each row whose class is
> tst1. For convenience*, I have extracted the first two cells in such
> rows, along with titles, to a CSV file."  The file is rt.csv in the
> same directory.

Thanks, I will use that.
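
Pulling the test words out of such a file could look roughly like the
sketch below.  It assumes the simplest possible layout: one row per
line, fields separated by plain commas, no quoted fields containing
commas.  The actual layout of rt.csv is not described here, so that is
an assumption, and the function name demo-csv-first-cells is
hypothetical.

```elisp
(defun demo-csv-first-cells (text)
  "Return the first comma-separated field of each non-empty line in TEXT.
Assumes no quoted fields and no commas inside fields."
  (mapcar (lambda (line) (car (split-string line ",")))
          ;; The third argument t drops empty lines.
          (split-string text "[\r\n]+" t)))

;; Possible usage, assuming rt.csv has been saved in the current directory:
;; (demo-csv-first-cells
;;  (with-temp-buffer
;;    (insert-file-contents "rt.csv")
;;    (buffer-string)))
```

If the file turns out to use quoted fields, a real CSV parser would be
needed instead of split-string.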




