bug-texinfo

`texindex` output depends on locale settings


From: Werner LEMBERG
Subject: `texindex` output depends on locale settings
Date: Sun, 06 Nov 2022 10:02:44 +0000 (UTC)

[texindex (GNU texinfo) 6.8dev]
[GNU Awk 4.2.1, API: 2.0]
[openSUSE Leap 15.4]


There are two bugs in `texindex` that make it basically unusable for
any main document language other than English.  The report below uses
the following input file, `sort-ca.texi`.

```
\input texinfo.tex

@documentencoding UTF-8
@documentlanguage ca

@findex a
@findex à
@findex u
@findex ù

@printindex fn

@bye
```

* The first, really severe bug is that the resulting output is
  completely broken if `texindex` is called with `LANG=C`.  Saying

  ```
  LANG=C texi2pdf sort-ca.texi 
  ```

  creates the following `.fns` output

  ```
  \initial {0xc3}
  \entry{\code {à}}{1}
  \entry{\code {ù}}{1}
  \initial {A}
  \entry{\code {a}}{1}
  \initial {U}
  \entry{\code {u}}{1}
  ```

  As can be seen, the first `\initial` line contains a single raw byte
  (shown here as '0xc3'; in the file it is a real byte, namely the
  first byte of the UTF-8 encoding of both 'à' and 'ù').  Surprisingly,
  this doesn't make pdftex abort, but both xetex and luatex stop with
  errors.  I have to use a UTF-8 locale like `en_US.utf8` to get decent
  output.

  I consider it very bad that `texindex` is locale-dependent.  IMHO
  the proper solution is to make `texinfo.tex` emit a document
  encoding statement to the (unsorted) index file that in turn gets
  acknowledged by `texindex`.

* While `texindex` is sensitive to the locale regarding the input
  encoding, it isn't for collation: any `LANG` or `LC_COLLATE` setting
  gets ignored.  Similarly, it ignores the `@documentlanguage`
  instruction to derive a sorting order.  For example, the Catalan
  order for the above example should be `aàuù`; in the output,
  however, it is sorted as `àùau`.

  The proper fix would be to make `texinfo.tex` emit a document
  language statement to the (unsorted) index file that in turn gets
  acknowledged by `texindex`.
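
To illustrate both points, here is a minimal Python sketch.  It is not
`texindex`'s actual code (which is written in awk); the function names
`byte_initial`, `char_initial`, and `accent_aware_key` are made up for
this example, and the sort key is only a rough stand-in for real
language-aware collation.  It shows that taking the first *byte* of a
UTF-8 entry yields 0xc3 for both 'à' and 'ù', while taking the first
*character* gives a usable initial, and that sorting on the base letter
first and the accent second produces the order expected above.

```python
import unicodedata

# The four index entries from the sample file, written with escapes so
# the accented letters are guaranteed to be precomposed (NFC).
entries = ["a", "\u00e0", "u", "\u00f9"]  # a, à, u, ù


def byte_initial(entry: str) -> str:
    """Mimic a byte-oriented tool: first byte of the UTF-8 encoding."""
    return hex(entry.encode("utf-8")[0])


def char_initial(entry: str) -> str:
    """First character, accent stripped and upper-cased, as a heading."""
    base = unicodedata.normalize("NFD", entry[0])[0]
    return base.upper()


def accent_aware_key(entry: str):
    """Sort key: base letters first, then the decomposed (accented) form.

    This is an assumption-laden approximation, not the Unicode
    Collation Algorithm and not a full Catalan tailoring.
    """
    decomposed = unicodedata.normalize("NFD", entry)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), decomposed)


print([byte_initial(e) for e in entries])     # à and ù both give 0xc3
print([char_initial(e) for e in entries])     # headings A, A, U, U
print(sorted(entries, key=accent_aware_key))  # a, à, u, ù
```

A real implementation would more likely rely on a proper collation
library selected from the document language; the point here is only
that both the initial and the sort order have to be computed on
characters, not bytes.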


     Werner
