Re: Encoding customization variable names
From: Gavin Smith
Subject: Re: Encoding customization variable names
Date: Sat, 23 Jul 2022 22:25:08 +0100
On Sat, Jul 23, 2022 at 06:19:45PM +0200, Patrice Dumas wrote:
> On Sat, Jul 23, 2022 at 04:25:02PM +0300, Eli Zaretskii wrote:
> > Are you saying that Unix systems with non-UTF-8 locales no longer
> > exist? Because I'm familiar with several people whose locale on
> > GNU/Linux does not use UTF-8 as the codeset.
>
> I am not saying that this does not exist anymore, but it is becoming
> very uncommon. We got a bug report recently that a file system did
> not allow non UTF-8 compatible file names. For me, this is without
> doubt the general direction. I don't know which codeset those people
> use, but if it is 8bit codesets, my wild guess is that, in general, they
> avoid non ASCII filenames and, in general, know what to do if they
> encounter a situation with filenames in other codesets. Which does not
> mean that we shouldn't correctly document the customization variables,
> it just means that issues should rarely occur.
>
> > And what about users in the CJK world?
>
> We only officially support UTF-8 in Texinfo as an encoding that contains
> CJK characters. Maybe this is not right, but in the current situation,
> I am not sure that many issues could arise.
I agree; I am not sure what the problem is likely to be. In theory,
texi2any should work with any encoding supported by the system (EUC-JP?),
while output with TeX would need to use UTF-8. However, texinfo.tex doesn't
support CJK characters with TeX; texinfo-ja.tex has to be used for
Japanese along with LuaTeX and XeTeX. Output in Chinese, or in
languages in other scripts, is probably not supported with TeX at all unless
something like texinfo-ja.tex is created. texi2any --latex might
work okay, though.
Here's the current text in the manual:
‘DOC_ENCODING_FOR_INPUT_FILE_NAME’
If set, use the input Texinfo document encoding information for the
encoding of input file names, such as file names specified as
‘@include’ or ‘@verbatiminclude’ arguments. If unset, use the
locale encoding instead. Default is set, except on MS-Windows
where the locale encoding is used by default.
Note that this is for file names only; ‘@documentencoding’ is
always used for the encoding of file content (*note
@documentencoding::).
The ‘INPUT_FILE_NAME_ENCODING’ variable overrides this variable.
‘DOC_ENCODING_FOR_OUTPUT_FILE_NAME’
If set, use the input Texinfo document encoding information for the
encoding of output file names, such as files specified with
‘--output’. If unset, use the locale encoding instead. Default is
unset, so file names are encoded using the current locale.
Note that this is for file names only; ‘OUTPUT_ENCODING_NAME’ is
used for the encoding of file content.
The ‘OUTPUT_FILE_NAME_ENCODING’ variable overrides this variable.
...
‘INPUT_FILE_NAME_ENCODING’
Encoding used for input file names. This variable overrides any
encoding from the document or current locale. Normally, you do not
need to set this variable, but it can be used if file names are in
a certain character encoding on a filesystem. An alternative is to
set ‘DOC_ENCODING_FOR_INPUT_FILE_NAME’ to ‘0’ to use the locale
encoding. See also ‘OUTPUT_FILE_NAME_ENCODING’.
...
‘OUTPUT_FILE_NAME_ENCODING’
Encoding used for output file names. This variable overrides any
encoding from the document or current locale.
Normally, you do not need to set this variable, but it can be used
if file names should be created in a certain character encoding on
a filesystem. See also ‘INPUT_FILE_NAME_ENCODING’.
Hopefully that covers most use cases.
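The precedence described in the manual text above can be sketched roughly as follows. This is only an illustration in Python, not the texi2any implementation: the function and parameter names are hypothetical, and the fallback to the locale encoding when the document declares no encoding is an assumption rather than something the manual text states.

```python
import locale


def resolve_file_name_encoding(explicit_encoding=None,
                               use_document_encoding=True,
                               document_encoding=None,
                               locale_encoding=None):
    """Pick the encoding for file names, following the documented precedence.

    Hypothetical helper, not texi2any code. The parameters stand in for:
      explicit_encoding     -> INPUT_FILE_NAME_ENCODING or OUTPUT_FILE_NAME_ENCODING
      use_document_encoding -> DOC_ENCODING_FOR_INPUT_FILE_NAME or
                               DOC_ENCODING_FOR_OUTPUT_FILE_NAME
      document_encoding     -> the @documentencoding value, if any
      locale_encoding       -> the codeset of the current locale
    """
    if locale_encoding is None:
        # Ask the system for the locale codeset without changing the locale.
        locale_encoding = locale.getpreferredencoding(False)
    # 1. An explicit *_FILE_NAME_ENCODING value overrides everything else.
    if explicit_encoding:
        return explicit_encoding
    # 2. Otherwise use the document encoding if the DOC_ENCODING_FOR_* flag is set.
    #    Assumption: with no declared document encoding, fall through to the locale.
    if use_document_encoding and document_encoding:
        return document_encoding
    # 3. Otherwise use the locale encoding.
    return locale_encoding


def encode_file_name(name, encoding):
    """Return the byte string that would be used as the file name on disk."""
    return name.encode(encoding)
```

For instance, encode_file_name("é.texi", "utf-8") and encode_file_name("é.texi", "iso-8859-1") give different byte strings for the same name, which is why the choice between the document encoding and the locale encoding matters for non-ASCII file names.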
> > But if we want to ignore those, I agree with you that the problem
> > largely doesn't exist.
>
> It probably exists, and hopefully will exist more, if more people use
> Texinfo in non-English languages/setups, but for now, it has been relatively
> small. We only started getting reports on non-ASCII characters in UTF-8
> locales very recently.
>
> We will try to document things as well as possible, and consider all
> the suggestions, but I still bet that actual use cases of file name
> encoding requiring specific customizations will be very rare.
>
> As for giving recommendations on practices related to these issues
> that's not something I can personally do, as I do not face those issues
> enough.
If people have use cases they find difficult to support, hopefully they
will report this so we can improve the documentation.