bug-gnu-emacs
bug#49066: 26.3; Segmentation fault on specific utf8 string


From: handa
Subject: bug#49066: 26.3; Segmentation fault on specific utf8 string
Date: Sat, 03 Jul 2021 11:05:05 +0900

In article <83bl7qp52q.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> > With the patch it still crashes for me in emacs-master with harfbuzz 
> > disabled:

> Too bad.
> Kenichi, any suggestions?

I checked the code again, and found that it was a fault of m17n-lib
which was not robust enough to handle an OTF table that is different
from what the library expects.

Here is a revised patch to handle such a case.  Could you please try it?

------------------------------------------------------------
diff --git a/src/ftfont.c b/src/ftfont.c
index 0603dd9ce6..12d0d72d27 100644
--- a/src/ftfont.c
+++ b/src/ftfont.c
@@ -2798,10 +2798,31 @@ ftfont_shape_by_flt (Lisp_Object lgstring, struct font *font,
 
   if (gstring.used > LGSTRING_GLYPH_LEN (lgstring))
     return Qnil;
+
+  /* mflt_run may fail to set g->g.to (which must be a valid index
+     into lgstring) correctly if the font has an OTF table that is
+     different from what the m17n library expects. */
   for (i = 0; i < gstring.used; i++)
     {
       MFLTGlyphFT *g = (MFLTGlyphFT *) (gstring.glyphs) + i;
+      if (g->g.to >= len)
+       {
+         /* Invalid g->g.to. */
+         g->g.to = len - 1;
+         int from = g->g.from;
+         /* Fix remaining glyphs. */
+         for (++i; i < gstring.used; i++)
+           {
+             g = (MFLTGlyphFT *) (gstring.glyphs) + i;
+             g->g.from = from;
+             g->g.to = len - 1;
+           }
+       }
+    }
 
+  for (i = 0; i < gstring.used; i++)
+    {
+      MFLTGlyphFT *g = (MFLTGlyphFT *) (gstring.glyphs) + i;
       g->g.from = LGLYPH_FROM (LGSTRING_GLYPH (lgstring, g->g.from));
       g->g.to = LGLYPH_TO (LGSTRING_GLYPH (lgstring, g->g.to));
     }
------------------------------------------------------------
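
For readers who want to see the clamping step in isolation, here is a minimal standalone sketch of the first loop in the patch.  The names struct glyph_range and clamp_glyph_ranges are hypothetical stand-ins invented for illustration; they are not part of ftfont.c or m17n-lib, and the sketch only mirrors the from/to handling, assuming len > 0 as the patch does.

```c
/* Hypothetical stand-in for the from/to fields of an MFLT glyph;
   not the real MFLTGlyphFT layout.  */
struct glyph_range
{
  int from;
  int to;
};

/* Mirror the first loop of the patch: the first glyph whose `to'
   index is >= LEN gets to = LEN - 1, and every later glyph takes
   that glyph's `from' and the same clamped `to'.  Glyphs before the
   first invalid one are left untouched.  Assumes LEN > 0.  */
static void
clamp_glyph_ranges (struct glyph_range *glyphs, int used, int len)
{
  for (int i = 0; i < used; i++)
    if (glyphs[i].to >= len)
      {
        /* Invalid `to': clamp it and remember where this glyph starts.  */
        int from = glyphs[i].from;
        glyphs[i].to = len - 1;

        /* Fold all remaining glyphs into the same range.  */
        for (++i; i < used; i++)
          {
            glyphs[i].from = from;
            glyphs[i].to = len - 1;
          }
      }
}
```

For example, with len = 3 and ranges {0,0}, {1,5}, {2,6}, the first glyph is kept as is and the other two both become {1,2}, so every index later passed to LGSTRING_GLYPH stays below len.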

> Btw, I think there's a bug in those patterns: ZWJ and ZWNJ shouldn't
> compose unless they are followed by a character.  See section 12.2 in
> the Unicode Standard.

Even if they should not be composed, we must include them in the
string to shape, because their presence may change the glyph of the
preceding character.  A shaper (m17n-lib or harfbuzz) must return a glyph
string that has an independent grapheme cluster for the final ZWJ/ZWNJ.

When m17n-lib was developed, the above rule was not clear.  To
conform to that rule, please put the attached BNG2-OTF.flt under the
directory ~/.m17n.d/.

---
K. Handa
handa@gnu.org

Attachment: BNG2-OTF.flt
Description: Binary data

