bug-texinfo

Re: XeTeX encoding problem


From: Masamichi HOSODA
Subject: Re: XeTeX encoding problem
Date: Tue, 12 Jan 2016 01:22:41 +0900 (JST)

>> On the other hand, XeTeX does not seem to have
>> anything like \XeTeXoutputencoding.
> 
> It appears not, from what I could find out.
> 
> For now, if you need to use XeTeX, you'd have to avoid any non-ASCII
> characters in anything written to an auxiliary file, e.g. use @"u
> instead of ü - that would be in section titles and index entries.
> 
> In theory, it should be possible to write out @U sequences to the
> auxiliary files wherever a non-ASCII character is used. I won't be
> working on this myself.
> 
>> At least in XeTeX, byte-wise input is hard to use, isn't it?
>> In my humble opinion, using the native Unicode support of XeTeX
>> (and maybe also LuaTeX) is better than byte-wise input.
> 
> Maybe. I don't have anything to add to what's been said earlier in
> this discussion.
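
As an illustration only of the @U idea quoted above (it is not part of
texinfo.tex or of the patch below), here is a minimal Python sketch that
rewrites a string so every non-ASCII character becomes a Texinfo @U{HEX}
sequence. The function name to_texinfo_ascii is hypothetical, and the
sketch assumes the input contains no @, { or } that would need escaping.

```python
def to_texinfo_ascii(text: str) -> str:
    """Replace each non-ASCII character with a Texinfo @U{HEX} sequence.

    Hypothetical helper for illustration; ASCII characters pass through
    unchanged and no Texinfo escaping of '@', '{' or '}' is attempted.
    """
    parts = []
    for ch in text:
        if ord(ch) < 128:
            parts.append(ch)
        else:
            # At least four hex digits, e.g. U+00FC -> @U{00FC}
            parts.append("@U{%04X}" % ord(ch))
    return "".join(parts)


print(to_texinfo_ascii("für"))  # prints f@U{00FC}r
```

Text written this way is pure ASCII, so it would not depend on how the
engine encodes output to auxiliary files.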

I've created a patch that uses the native Unicode support of both XeTeX and LuaTeX.
It works fine in my XeTeX, LuaTeX, and pdfTeX environments,
except that LuaTeX creates broken PDF bookmarks.

How about this?
--- texinfo.tex.org     2016-01-09 09:38:07.812241700 +0900
+++ texinfo.tex 2016-01-12 01:10:58.012335400 +0900
@@ -1779,7 +1779,7 @@
 % #4 = \mainmagstep
 % #5 = OT1
 %
-\def\setfont#1#2#3#4#5{%
+\def\setfontdefault#1#2#3#4#5{%
   \font#1=\fontprefix#2#3 scaled #4
   \csname cmap#5\endcsname#1%
 }
@@ -1811,6 +1811,91 @@
 \def\scshape{csc}
 \def\scbshape{csc}
 
+% Native Unicode font settings for the XeTeX and LuaTeX engines
+\newif\iftxiusenativeunicode
+\ifx\XeTeXrevision\thisisundefined
+  \ifx\luatexversion\thisisundefined
+    \txiusenativeunicodefalse
+  \else
+    \txiusenativeunicodetrue
+    \input luaotfload.sty
+  \fi
+\else
+  \txiusenativeunicodetrue
+\fi
+
+\iftxiusenativeunicode
+  \def\setfontunicode#1#2#3#4#5{%
+    \def\fontprefix{roman}
+    \def\fontsuffix{regular}
+    \edef\fontshape{#2}
+    \ifx\fontshape\rmshape % r
+      \def\fontprefix{roman}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\rmbshape % bx
+      \def\fontprefix{roman}
+      \def\fontsuffix{bold}
+    \fi
+    \ifx\fontshape\bfshape % b
+      \def\fontprefix{romandemi}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\bxshape % bx
+      \def\fontprefix{roman}
+      \def\fontsuffix{bold}
+    \fi
+    \ifx\fontshape\ttshape % tt
+      \def\fontprefix{mono}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\ttbshape % tt
+      \def\fontprefix{mono}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\ttslshape % sltt
+      \def\fontprefix{monoslant}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\itshape % ti
+      \def\fontprefix{roman}
+      \def\fontsuffix{italic}
+    \fi
+    \ifx\fontshape\itbshape % bxti
+      \def\fontprefix{roman}
+      \def\fontsuffix{bolditalic}
+    \fi
+    \ifx\fontshape\slshape % sl
+      \def\fontprefix{romanslant}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\slbshape % bxsl
+      \def\fontprefix{romanslant}
+      \def\fontsuffix{bold}
+    \fi
+    \ifx\fontshape\sfshape % ss
+      \def\fontprefix{sans}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\sfbshape % ss
+      \def\fontprefix{sans}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\scshape % csc
+      \def\fontprefix{romancaps}
+      \def\fontsuffix{regular}
+    \fi
+    \ifx\fontshape\scbshape %csc
+      \def\fontprefix{romancaps}
+      \def\fontsuffix{regular}
+    \fi
+    \font#1="[lm\fontprefix#3-\fontsuffix.otf]" scaled #4
+  }%
+  \let\setfont\setfontunicode
+\else
+  \let\setfont\setfontdefault
+\fi
+
 % Definitions for a main text size of 11pt.  (The default in Texinfo.)
 %
 \def\definetextfontsizexi{%
@@ -9428,32 +9513,6 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise the encoding support is completely broken.
-\ifx\XeTeXrevision\thisisundefined
-\else
-\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
-\XeTeXinputencoding "bytes"  % Effective in texinfo.tex only
-\fi
-
-\ifx\luatexversion\thisisundefined
-\else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-
-local function convert_char (char)
-  return utf8_char(byte(char))
-end
-
-local function convert_line (line)
-  return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-}
-\fi
-
-
 % Helpers for encodings.
 % Set the catcode of characters 128 through 255 to the specified number.
 %
@@ -9478,13 +9537,6 @@
 %
 \def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
 \def\documentencodingzzz#1{%
-  % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
-  % otherwise the encoding support is completely broken.
-  % This settings is for the document root file.
-  \ifx\XeTeXrevision\thisisundefined
-  \else
-    \XeTeXinputencoding "bytes"
-  \fi
   %
   % Encoding being declared for the document.
   \def\declaredencoding{\csname #1.enc\endcsname}%
@@ -9501,22 +9553,38 @@
      \asciichardefs
   %
   \else \ifx \declaredencoding \lattwo
-     \setnonasciicharscatcode\active
-     \lattwochardefs
+     \iftxiusenativeunicode
+       \message{TeX engine cannot use @documentencoding #1. Use UTF-8.}
+     \else
+       \setnonasciicharscatcode\active
+       \lattwochardefs
+     \fi
   %
   \else \ifx \declaredencoding \latone
-     \setnonasciicharscatcode\active
-     \latonechardefs
+     \iftxiusenativeunicode
+       \message{TeX engine cannot use @documentencoding #1. Use UTF-8.}
+     \else
+       \setnonasciicharscatcode\active
+       \latonechardefs
+     \fi
   %
   \else \ifx \declaredencoding \latnine
-     \setnonasciicharscatcode\active
-     \latninechardefs
+     \iftxiusenativeunicode
+       \message{TeX engine cannot use @documentencoding #1. Use UTF-8.}
+     \else
+       \setnonasciicharscatcode\active
+       \latninechardefs
+     \fi
   %
   \else \ifx \declaredencoding \utfeight
-     \setnonasciicharscatcode\active
-     % since we already invoked \utfeightchardefs at the top level
-     % (below), do not re-invoke it, then our check for duplicated
-     % definitions triggers.  Making non-ascii chars active is enough.
+     \iftxiusenativeunicode
+       \setnonasciicharscatcode\other
+     \else
+       \setnonasciicharscatcode\active
+       % since we already invoked \utfeightchardefs at the top level
+       % (below), do not re-invoke it, then our check for duplicated
+       % definitions triggers.  Making non-ascii chars active is enough.
+     \fi
   %
   \else
     \message{Ignoring unknown document encoding: #1.}%
@@ -10639,6 +10707,10 @@
   \defstringchar^^f4\defstringchar^^f5\defstringchar^^f6\defstringchar^^f7%
   \defstringchar^^f8\defstringchar^^f9\defstringchar^^fa\defstringchar^^fb%
   \defstringchar^^fc\defstringchar^^fd\defstringchar^^fe\defstringchar^^ff%
+
+  \iftxiusenativeunicode
+    \setnonasciicharscatcode\other
+  \fi
 }
 
 
\input texinfo.tex

@documentencoding UTF-8

@contents

@chapter für

für

@bye
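
To make the difference between the two input modes concrete, the
following Python sketch (an illustration only, not code from the patch)
compares how the chapter title in the test file above looks when read
byte by byte, which is what the removed "bytes" settings and the removed
Lua callback arranged, and when decoded as UTF-8, which is what the
patch relies on. The function names read_bytewise and read_native are
hypothetical.

```python
def read_bytewise(data: bytes) -> str:
    """Map every input byte to the character with the same code (0-255)."""
    return "".join(chr(b) for b in data)


def read_native(data: bytes) -> str:
    """Decode the input as UTF-8, one character per code point."""
    return data.decode("utf-8")


raw = "für".encode("utf-8")  # the bytes 66 C3 BC 72

bytewise = read_bytewise(raw)
native = read_native(raw)

print(len(raw), len(bytewise), len(native))  # prints 4 4 3
print([hex(ord(c)) for c in bytewise])  # ['0x66', '0xc3', '0xbc', '0x72']
print([hex(ord(c)) for c in native])    # ['0x66', '0xfc', '0x72']
```

In byte-wise mode the macros have to reassemble C3 BC into one character
themselves; in native mode the engine already delivers U+00FC as a single
character, which is why the patch gives non-ASCII characters catcode
"other" for UTF-8 input on XeTeX and LuaTeX.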
