Re: `texindex` output depends on locale settings
From: Werner LEMBERG
Subject: Re: `texindex` output depends on locale settings
Date: Sun, 06 Nov 2022 16:09:59 +0000 (UTC)
>> > > > I consider it very bad that `texindex` is locale-dependent.
>> > > > IMHO the proper solution is to make `texinfo.tex` emit a
>> > > > document encoding statement to the (unsorted) index file
>> > > > that in turn gets acknowledged by `texindex`.
>>
>> Sure? No. But I have some thoughts.
>>
>> > FWIW, I don't even understand how can this be accomplished,
>> > unless the program reinvents all the library functions that deal
>> > with characters from scratch, instead of using libc functions
>> > (which are locale-dependent). And Gawk does use libc functions
>> > for that.
>>
>> The current islower() is
>>
>> function islower(c)
>> {
>>     return index("abcdefghijklmnopqrstuvwxyz", c) > 0
>> }
>>
>> It could instead be
>>
>> function islower(c)
>> {
>>     return c ~ /[[:lower:]]/
>> }
>>
>> And similar for the others. That would work for any unicode
>> character.
>
> Sure, but is the issue only with lower-case letters? What about
> collation order or even determining what is and isn't a character
> (as opposed to incomplete byte sequence)?
Two remarks.
* I think it would be OK if the documentation says that i18n support
for sorting only works with awk programs that understand `LANG`.
* Let's assume that GNU awk behaves similarly to, say, GNU sort: the
collation order and input encoding are controlled by `LANG`.
Judging from the awk info manual, this seems like a reasonable
assumption.
As far as I can see, my two issues could be resolved by a shell
wrapper around the awk program that analyzes the (yet to be added)
`@documentencoding` and `@documentlanguage` settings in an unsorted
index file. From those two settings the wrapper synthesizes a proper
`LANG` value that gets passed to GNU awk, et voilà.
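
A rough sketch of such a wrapper, to make the idea concrete. Everything
here is an assumption: the index file does not carry these settings yet,
so the line format (`@documentencoding UTF-8`, `@documentlanguage de_DE`
on lines of their own), the function names, and the `TEXINDEX_AWK`
variable are all hypothetical. It also assumes the language value already
has the `ll_CC` form; a bare `de` would still need to be mapped to a
locale that is installed on the system.

```shell
#!/bin/sh
# Sketch only: the "@setting value" line format is an assumption,
# not something texinfo.tex writes today.

# Print the value of the first "@<setting> <value>" line in file $2.
get_setting () {
  sed -n "s/^@$1[[:space:]][[:space:]]*\([^[:space:]]*\).*/\1/p" "$2" |
    head -n 1
}

# Combine language and encoding into a locale name such as de_DE.UTF-8.
# Falls back to the "C" locale when no language is given.
synthesize_lang () {
  lang=$1
  enc=$2
  if [ -z "$lang" ]; then
    echo C
  elif [ -z "$enc" ]; then
    echo "$lang"
  else
    echo "$lang.$enc"
  fi
}

# Hypothetical entry point: run the awk program named by TEXINDEX_AWK
# on index file $1 with the derived locale.
# Note: LC_ALL or LC_COLLATE, if set in the caller's environment,
# take precedence over LANG.
run_texindex () {
  file=$1
  loc=$(synthesize_lang "$(get_setting documentlanguage "$file")" \
                        "$(get_setting documentencoding "$file")")
  LANG=$loc gawk -f "$TEXINDEX_AWK" "$file"
}
```

With `TEXINDEX_AWK` pointing at the awk script, `run_texindex foo.cp`
would then sort `foo.cp` under whatever locale the file itself declares,
independent of the caller's `LANG`.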
Am I missing something?
Werner