bug-gnu-emacs

bug#41005: problem with rendering Persian text in Emacs 27


From: Eli Zaretskii
Subject: bug#41005: problem with rendering Persian text in Emacs 27
Date: Sat, 06 Jun 2020 12:04:04 +0300

> From: Pip Cet <pipcet@gmail.com>
> Cc: valizadeh.ho@gmail.com,  41005@debbugs.gnu.org,  nicholasdrozd@gmail.com
> Date: Sat, 06 Jun 2020 08:38:39 +0000
> 
> >> Given these two bugs, I wonder whether it wouldn't be more reasonable
> >> always to let HarfBuzz guess the direction, at least for Emacs-27:
> >> scripts which change direction, if they are supported by HarfBuzz, won't
> >> work anyway.
> >
> > Please explain "scripts that change direction" and "won't work
> > anyway", I don't think I understand that part.
> 
> I think your example (RLO..PDF in RTL text) is better: that won't work
> anyway, right now, because if, for example, you type
> 
> <HEBREW LETTER SHIN> <RIGHT-TO-LEFT OVERRIDE> f i
> 
> and have set the char table to treat "fi" as a ligature, the result will
> (at least sometimes) be an "fi" ligature, but it should look like the
> word "if".

That's not how shaping engines work, or at least not how HarfBuzz
works, AFAIU.  It gets the characters in the logical order, so it always
wants to see "fi", even if the directionality of the characters was
overridden, and it also wants to know the local text directionality.
What is produced from that depends on the font: if it has different
ligatures for "fi" in different directions, then HarfBuzz should give
us back the ligature appropriate for the direction it was passed.
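
To make the point above concrete, here is a toy model, not HarfBuzz's
actual API: the TOY_FONT table and the toy_shape function are
hypothetical names invented for illustration.  The shaper always
receives the characters in logical order plus a direction, and the
font data alone decides what comes back:

```python
# Toy model only: a hypothetical "font" mapping
# (logical character sequence, direction) to a ligature glyph name.
TOY_FONT = {
    ("fi", "ltr"): "fi.liga",
    # No ("fi", "rtl") entry: this imaginary font has no RTL "fi" ligature.
}


def toy_shape(logical_text, direction):
    """Return glyph names for logical_text.

    The input is always in logical order; this toy does no reordering
    and only looks up two-character ligatures for the given direction.
    """
    glyphs = []
    i = 0
    while i < len(logical_text):
        pair = logical_text[i:i + 2]
        ligature = TOY_FONT.get((pair, direction))
        if ligature is not None:
            glyphs.append(ligature)
            i += 2
        else:
            glyphs.append(logical_text[i])
            i += 1
    return glyphs
```

With this imaginary font, toy_shape("fi", "ltr") yields the ligature,
while toy_shape("fi", "rtl") yields two separate glyphs, even though
the input string is the same in both calls.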

(Personally, I think that when some text uses a directional override,
they don't intend to see ligatures, because the override is mostly for
treating characters as independent of the surrounding context.  But
this is eventually up to the font to specify.  AFAIU, Arabic shaping
works differently in different directional contexts, for example.)

> > The reason we don't let HarfBuzz guess in all cases is because the
> > resolved bidi level, when we have it, is a more accurate indication of
> > the required direction.
> 
> Yes, but we'll still cache the wrong direction.

Why "wrong"?  We will cache the same direction as we passed to
HarfBuzz, and thus the produced glyphs will be consistent with the
cached direction.  And if we ever need to display the same sequence of
characters with a different direction, the cached sequence will fail
to match, and we will call HarfBuzz again to produce glyphs for this
other direction.  That sounds TRT to me.
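
A minimal sketch of that caching behavior, assuming the cache key is
simply the pair (characters, direction).  CompositionCache and
stub_shaper are hypothetical names; this is not the actual Emacs
composition cache, only an illustration of why a different direction
misses the cache and triggers a fresh shaping call:

```python
class CompositionCache:
    """Toy cache keyed on (characters, direction)."""

    def __init__(self, shaper):
        self._shaper = shaper    # callable(text, direction) -> glyph list
        self._entries = {}
        self.shaper_calls = 0    # counts how often the shaper really ran

    def glyphs_for(self, text, direction):
        key = (text, direction)
        if key not in self._entries:
            # Cache miss: same characters with another direction land here.
            self.shaper_calls += 1
            self._entries[key] = self._shaper(text, direction)
        return self._entries[key]


def stub_shaper(text, direction):
    # Stand-in for a real shaping call; tags each glyph with the direction.
    return [f"{ch}@{direction}" for ch in text]


cache = CompositionCache(stub_shaper)
ltr_first = cache.glyphs_for("fi", "ltr")
ltr_again = cache.glyphs_for("fi", "ltr")   # served from the cache
rtl = cache.glyphs_for("fi", "rtl")         # different key, shaper runs again
```

After these three lookups the shaper has run twice: once per distinct
direction, and never for the repeated LTR request.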

> If we let HarfBuzz guess in all cases, output will be consistent and
> usually correct

We want the direction to be _always_ correct, not just "usually".  The
shapers we used before HarfBuzz didn't allow us to pass the direction;
they always guessed it.  HarfBuzz lets us specify the direction, which
is progress, since Emacs now has better control over the glyphs that
are produced, and HarfBuzz developers tell us the difference sometimes
matters.

> > For example, if you have RTL characters
> > inside the LRO..PDF embedding, it would be wrong to let the shaper
> > guess, because it could (and usually will) guess wrongly that the
> > direction is R2L.  It is true that these are rare and unusual use
> > cases, but they do exist, and Emacs does want to support them,
> > including with scripts that must use the shaping engine.
> 
> As I described, I don't think RLO..PDF works with shaping right now,
> because other code might have already cached the non-overridden glyph
> string.

I was saying that under the assumption that the direction will be
cached.  You are right that currently this doesn't work correctly, but
that's exactly why we agreed to cache the direction with the other
composition information.  Once the caching of direction is
implemented, my point is that passing the direction to HarfBuzz and
caching it will produce better results for text in a directional
override than if we let HarfBuzz guess the direction.
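
The difference between the two approaches can be sketched as follows.
This is an illustration under stated assumptions, not Emacs or
HarfBuzz code: direction_from_level and guess_direction are
hypothetical helpers, and guess_direction is only a crude stand-in for
a shaper's guess (it looks at the first strong character, whereas
HarfBuzz's own guess is derived from the script).  The level-to-direction
rule is the one from the Unicode Bidirectional Algorithm: even levels
are left-to-right, odd levels are right-to-left.

```python
import unicodedata


def direction_from_level(level):
    # UAX#9: even embedding levels are LTR, odd levels are RTL.
    return "rtl" if level % 2 else "ltr"


def guess_direction(text):
    # Crude stand-in for a shaper's guess: first strong character wins.
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            return "rtl"
        if bidi_class == "L":
            return "ltr"
    return "ltr"  # no strong characters: default to LTR


shin = "\u05E9"  # HEBREW LETTER SHIN
text = shin * 3

# Inside LRO..PDF in an LTR paragraph, the override pushes the next
# even level, so these characters resolve to level 2.
resolved_level = 2

passed = direction_from_level(resolved_level)
guessed = guess_direction(text)
```

Here the resolved level gives "ltr", which is what the override asks
for, while the guess based on the characters alone gives "rtl".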




