Re: XeTeX encoding problem
From: Masamichi HOSODA
Subject: Re: XeTeX encoding problem
Date: Tue, 12 Jan 2016 01:22:41 +0900 (JST)
>> On the other hand, XeTeX does not seem to have anything like
>> \XeTeXoutputencoding.
>
> It appears not, from what I could find out.
>
> For now, if you need to use XeTeX, you'd have to avoid any non-ASCII
> characters in anything written to an auxiliary file, e.g. use @"u
> instead of ü - that would be in section titles and index entries.
>
> In theory, it should be possible to write out @U sequences to the
> auxiliary files wherever a non-ASCII character is used. I won't be
> working on this myself.
>
>> At least in XeTeX, byte-wise input is hard to use, isn't it?
>> In my humble opinion, using the native Unicode support of XeTeX (and
>> maybe also LuaTeX) is better than byte-wise input.
>
> Maybe. I don't have anything to add to what's been said earlier in
> this discussion.
I've created a patch that uses the native Unicode support of both XeTeX and LuaTeX.
It works in my XeTeX, LuaTeX and pdfTeX environments,
except that LuaTeX creates broken PDF bookmarks.
How about this?
--- texinfo.tex.org 2016-01-09 09:38:07.812241700 +0900
+++ texinfo.tex 2016-01-12 01:10:58.012335400 +0900
@@ -1779,7 +1779,7 @@
% #4 = \mainmagstep
% #5 = OT1
%
-\def\setfont#1#2#3#4#5{%
+\def\setfontdefault#1#2#3#4#5{%
\font#1=\fontprefix#2#3 scaled #4
\csname cmap#5\endcsname#1%
}
@@ -1811,6 +1811,91 @@
\def\scshape{csc}
\def\scbshape{csc}
+% Native Unicode fonts settings for XeTeX and LuaTeX engine
+\newif\iftxiusenativeunicode
+\ifx\XeTeXrevision\thisisundefined
+ \ifx\luatexversion\thisisundefined
+ \txiusenativeunicodefalse
+ \else
+ \txiusenativeunicodetrue
+ \input luaotfload.sty
+ \fi
+\else
+ \txiusenativeunicodetrue
+\fi
+
+\iftxiusenativeunicode
+ \def\setfontunicode#1#2#3#4#5{%
+ \def\fontprefix{roman}
+ \def\fontsuffix{regular}
+ \edef\fontshape{#2}
+ \ifx\fontshape\rmshape % r
+ \def\fontprefix{roman}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\rmbshape % bx
+ \def\fontprefix{roman}
+ \def\fontsuffix{bold}
+ \fi
+ \ifx\fontshape\bfshape % b
+ \def\fontprefix{romandemi}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\bxshape % bx
+ \def\fontprefix{roman}
+ \def\fontsuffix{bold}
+ \fi
+ \ifx\fontshape\ttshape % tt
+ \def\fontprefix{mono}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\ttbshape % tt
+ \def\fontprefix{mono}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\ttslshape % sltt
+ \def\fontprefix{monoslant}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\itshape % ti
+ \def\fontprefix{roman}
+ \def\fontsuffix{italic}
+ \fi
+ \ifx\fontshape\itbshape % bxti
+ \def\fontprefix{roman}
+ \def\fontsuffix{bolditalic}
+ \fi
+ \ifx\fontshape\slshape % sl
+ \def\fontprefix{romanslant}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\slbshape % bxsl
+ \def\fontprefix{romanslant}
+ \def\fontsuffix{bold}
+ \fi
+ \ifx\fontshape\sfshape % ss
+ \def\fontprefix{sans}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\sfbshape % ss
+ \def\fontprefix{sans}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\scshape % csc
+ \def\fontprefix{romancaps}
+ \def\fontsuffix{regular}
+ \fi
+ \ifx\fontshape\scbshape %csc
+ \def\fontprefix{romancaps}
+ \def\fontsuffix{regular}
+ \fi
+ \font#1="[lm\fontprefix#3-\fontsuffix.otf]" scaled #4
+ }%
+ \let\setfont\setfontunicode
+\else
+ \let\setfont\setfontdefault
+\fi
+
% Definitions for a main text size of 11pt. (The default in Texinfo.)
%
\def\definetextfontsizexi{%
@@ -9428,32 +9513,6 @@
\global\righthyphenmin = #3\relax
}
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX,
-% otherwise the encoding support is completely broken.
-\ifx\XeTeXrevision\thisisundefined
-\else
-\XeTeXdefaultencoding "bytes" % For subsequent files to be read
-\XeTeXinputencoding "bytes" % Effective in texinfo.tex only
-\fi
-
-\ifx\luatexversion\thisisundefined
-\else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-
-local function convert_char (char)
- return utf8_char(byte(char))
-end
-
-local function convert_line (line)
- return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-}
-\fi
-
-
% Helpers for encodings.
% Set the catcode of characters 128 through 255 to the specified number.
%
@@ -9478,13 +9537,6 @@
%
\def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
\def\documentencodingzzz#1{%
- % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
- % otherwise the encoding support is completely broken.
- % This settings is for the document root file.
- \ifx\XeTeXrevision\thisisundefined
- \else
- \XeTeXinputencoding "bytes"
- \fi
%
% Encoding being declared for the document.
\def\declaredencoding{\csname #1.enc\endcsname}%
@@ -9501,22 +9553,38 @@
\asciichardefs
%
\else \ifx \declaredencoding \lattwo
- \setnonasciicharscatcode\active
- \lattwochardefs
+ \iftxiusenativeunicode
+ \message{TeX engine cannot use @documentencoding #1. Use UTF-8.}
+ \else
+ \setnonasciicharscatcode\active
+ \lattwochardefs
+ \fi
%
\else \ifx \declaredencoding \latone
- \setnonasciicharscatcode\active
- \latonechardefs
+ \iftxiusenativeunicode
+ \message{TeX engine cannot use @documentencoding #1. Use UTF-8.}
+ \else
+ \setnonasciicharscatcode\active
+ \latonechardefs
+ \fi
%
\else \ifx \declaredencoding \latnine
- \setnonasciicharscatcode\active
- \latninechardefs
+ \iftxiusenativeunicode
+ \message{TeX engine cannot use @documentencoding #1. Use UTF-8.}
+ \else
+ \setnonasciicharscatcode\active
+ \latninechardefs
+ \fi
%
\else \ifx \declaredencoding \utfeight
- \setnonasciicharscatcode\active
- % since we already invoked \utfeightchardefs at the top level
- % (below), do not re-invoke it, then our check for duplicated
- % definitions triggers. Making non-ascii chars active is enough.
+ \iftxiusenativeunicode
+ \setnonasciicharscatcode\other
+ \else
+ \setnonasciicharscatcode\active
+ % since we already invoked \utfeightchardefs at the top level
+ % (below), do not re-invoke it, then our check for duplicated
+ % definitions triggers. Making non-ascii chars active is enough.
+ \fi
%
\else
\message{Ignoring unknown document encoding: #1.}%
@@ -10639,6 +10707,10 @@
\defstringchar^^f4\defstringchar^^f5\defstringchar^^f6\defstringchar^^f7%
\defstringchar^^f8\defstringchar^^f9\defstringchar^^fa\defstringchar^^fb%
\defstringchar^^fc\defstringchar^^fd\defstringchar^^fe\defstringchar^^ff%
+
+ \iftxiusenativeunicode
+ \setnonasciicharscatcode\other
+ \fi
}
\input texinfo.tex
@documentencoding UTF-8
@contents
@chapter für
für
@bye
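
To illustrate the font lookup that \setfontunicode in the patch performs, here is a minimal stand-alone plain TeX sketch. It is not part of the patch. It assumes XeTeX and an installed set of Latin Modern OpenType fonts; the name \demofont and the scale factor 1000 are made up for the illustration. For shape "r" and design size 10, the patch would assemble the file name from "lm", the prefix "roman", the size, a hyphen and the suffix "regular".

```tex
% Hypothetical stand-alone check, not part of the patch.
% Assumption: the Latin Modern OpenType fonts are installed.
\ifx\XeTeXrevision\undefined
  \message{This sketch needs XeTeX.}
\else
  % File name as \setfontunicode would build it for shape "r", size 10:
  %   lm + roman + 10 + - + regular + .otf
  \font\demofont="[lmroman10-regular.otf]" scaled 1000
  % The UTF-8 input is read natively; no active-character tricks are needed.
  \demofont für
\fi
\bye
```

Under LuaTeX the same \font syntax would additionally need luaotfload to be loaded first, as the patch does with \input luaotfload.sty.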