bug-gnu-emacs

bug#20140: 24.4; M17n shaper output rejected


From: Richard Wordingham
Subject: bug#20140: 24.4; M17n shaper output rejected
Date: Mon, 14 Feb 2022 23:26:23 +0000

On Mon, 14 Feb 2022 15:26:07 +0200
Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Sun, 13 Feb 2022 21:11:52 +0000
> > From: Richard Wordingham <richard.wordingham@ntlworld.com>
> > Cc: larsi@gnus.org, 20140@debbugs.gnu.org

> No, that's not true.  I'm not aware of any such limitation; AFAIK
> Arabic shaping works correctly in Emacs, certainly with HarfBuzz and
> Emacs 27 or later.
> 
> Or maybe I misunderstand what you mean by "typewriter-like" fonts?
> Can you give an example of a non-typewriter-like font for Arabic that
> I can find on MS-Windows and try?

Not off the top of my head, but compare لحج with the presentation form
‎ﳊ U+FCCA ARABIC LIGATURE LAM WITH HAH INITIAL FORM for the first two
letters.  The lam part is a vertical line in the middle of the glyph;
the 'hah' part forms the lower part of the glyph.
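
As a rough illustration of how the two relate at the character level, here
is a minimal Python sketch using only the standard unicodedata module.  It
shows only that U+FCCA is a compatibility form of lam followed by hah; it
says nothing about glyph shapes, which depend entirely on the font.  The
names LIGATURE, WORD and compat_letters are made up for this example.

```python
import unicodedata

LIGATURE = "\uFCCA"             # the presentation form named above
WORD = "\u0644\u062D\u062C"     # lam, hah, jeem: the word being compared

def compat_letters(ch):
    """Return the ordinary letters that a compatibility character maps to."""
    return unicodedata.normalize("NFKC", ch)

decomposed = compat_letters(LIGATURE)

print(unicodedata.name(LIGATURE))
for ch in decomposed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# True when the word begins with the same two letters as the ligature.
print(WORD.startswith(decomposed))
```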

> > There would be a similar problem with the use of Tai Khuen or other
> > tunnelling fonts for Northern Thai if you used the current mechanism
> > for advancing character by character.  Tunnelling fonts write parts
> > of one cluster under the next.  The Tai Khuen fonts I've seen do
> > this by relying on characteristics of Tai Khuen spelling.  The
> > rules don't hold for Northern Thai, and consequently the subscript
> > portions of successive orthographic syllables can overwrite one
> > another.  A sophisticated font could check for clashes, but that
> > needs the orthographic syllables to be passed to the shaper
> > together.  
> 
> I'm not sure I understand.  Does HarfBuzz know about these advancement
> features?  We rely on HarfBuzz to give us back as many grapheme
> clusters as it sees fit for a given chunk of text, and we expect each
> grapheme cluster to include glyphs with relative offsets as needed by
> the script and the font.

No, the fonts rely on the grammar of Tai Khuen.  If an orthographic
syllable contains U+1A6C TAI THAM VOWEL SIGN OA BELOW, there will be a
following orthographic syllable in the same phonetic syllable, and
it will consist of a single consonant with no tail and possibly some
marks above.  The font designers therefore do not worry about the
effect on the advance width; there will be room for U+1A6C below the
next orthographic syllable.  If you want to see details now, enter
ᩉ᩠ᨾᩬᩁ ᩉ᩠ᨾᩳᨶᩥ᩠ᨯ ᩉ᩠ᨾᩬᩴᨶᩥ᩠ᨯ in the 'Play Area' text box of
https://wrdingham.co.uk/lanna/renderer_test.htm.  The first word is
spelt the same in Northern Thai and Tai Khuen.  As you switch the font
from Lamphun to A Tai Tham KH (with ccmp enabled if you are using IE
11), the glyphs at the bottom of the word spread out to use the
available space.  The next two words are 'Dr Nit' written in Tai Khuen
and Northern Thai.  The word for 'Dr', /mɔː/, is spelt quite
differently in the two languages, though the consonants are the same.
Both have a vowel above, but the Northern Thai also has U+1A6C below,
as in the first word.  When A Tai Tham KH is selected as the font, the
U+1A6C clashes badly with the bottom of the second syllable, 'Nit'.
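
If you want to check which of the test words actually contain U+1A6C before
trying them in the renderer, a small Python sketch along these lines may
help.  It only inspects code points; it does not shape text or detect
clashes.  The helper names describe and words_with_oa_below are invented
for this example.

```python
import unicodedata

OA_BELOW = "\u1A6C"  # TAI THAM VOWEL SIGN OA BELOW

def describe(text):
    """Return (code point, character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in text]

def words_with_oa_below(text):
    """Return the space-separated words that contain U+1A6C."""
    return [word for word in text.split() if OA_BELOW in word]

if __name__ == "__main__":
    # Paste the words from the 'Play Area' example in place of this sample.
    sample = "\u1A3E\u1A6C\u1A41"
    for code_point, name in describe(sample):
        print(code_point, name)
    print(len(words_with_oa_below(sample)), "word(s) contain U+1A6C")
```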

This phenomenon of a vowel below expanding below the next consonant
also occurs in Northern Thai, but I don't know of any Northern Thai
font that is clever enough to do this, because checking for space below
the next consonant is fiddly.

> IOW, this job is delegated to the shaping engine, such as HarfBuzz;
> Emacs just takes the glyphs and offsets HarfBuzz gives us and blindly
> obeys them.

The first problem is that font writers tend to make assumptions about the
language their font will be used for.  The second is that with a good
tunnelling font, HarfBuzz needs to know what comes in the next
syllable.  At present, using a tunnelling font for Tai Tham risks
clashes when used with Emacs.  The Tai Khuen fonts look good, but are
not suitable for writing Northern Thai.

Richard.




