bug-texinfo

Re: Encoding customization variable names


From: Gavin Smith
Subject: Re: Encoding customization variable names
Date: Sat, 23 Jul 2022 22:25:08 +0100

On Sat, Jul 23, 2022 at 06:19:45PM +0200, Patrice Dumas wrote:
> On Sat, Jul 23, 2022 at 04:25:02PM +0300, Eli Zaretskii wrote:
> > Are you saying that Unix systems with non-UTF-8 locales no longer
> > exist?  Because I'm familiar with several people whose locale on
> > GNU/Linux does not use UTF-8 as the codeset.
> 
> I am not saying that this does not exist anymore, but it is becoming
> very uncommon.  We got a bug report recently that a file system did
> not allow non UTF-8 compatible file names.  For me, this is without
> doubt the general direction.  I don't know which codeset those people
> use, but if they are 8-bit codesets, my wild guess is that, in general, they
> avoid non-ASCII file names and, in general, know what to do if they
> encounter a situation with filenames in other codesets.  Which does not
> mean that we shouldn't correctly document the customization variables,
> it just means that issues should rarely occur.
> 
> > And what about users in the CJK world?
> 
> We only officially support UTF-8 in Texinfo as an encoding that contains
> CJK characters.  Maybe this is not right, but in the current situation,
> I am not sure that many issues could arise.

I agree; I am not sure what the problem is likely to be.  In theory,
texi2any should work with any encoding supported by the system (EUC-JP?),
while output with TeX would need to use UTF-8.  Even then, texinfo.tex
itself doesn't support CJK characters with TeX: texinfo-ja.tex has to be
used for Japanese, along with LuaTeX and XeTeX.  Output with TeX in
Chinese, or in languages in other scripts, is probably not supported at
all unless something else like texinfo-ja.tex is created.  texi2any
--latex might work okay, though.
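
To make the file name side of this concrete, here is a small Python
sketch (not texi2any code) showing that the same file name is a
different byte sequence under UTF-8 and EUC-JP, which is why the tool
has to know which encoding to use when it turns a name from the document
into bytes on disk.  The file name and the helper 'decodes_as' are made
up for illustration.

```python
# Hypothetical file name containing Japanese characters.
name = "日本語.texi"

# The same name encoded with two different codesets.
utf8_bytes = name.encode("utf-8")
eucjp_bytes = name.encode("euc_jp")


def decodes_as(data, encoding):
    """Return True if the bytes are valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print(len(utf8_bytes), len(eucjp_bytes))
# A name written as EUC-JP bytes cannot be read back as UTF-8.
print(decodes_as(eucjp_bytes, "utf-8"))
```

A name created under one codeset and looked up under another simply
fails to match, which is the situation the customization variables
below are meant to cover.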

Here's the current text in the manual:

‘DOC_ENCODING_FOR_INPUT_FILE_NAME’
     If set, use the input Texinfo document encoding information for the
     encoding of input file names, such as file names specified as
     ‘@include’ or ‘@verbatiminclude’ arguments.  If unset, use the
     locale encoding instead.  Default is set, except on MS-Windows
     where the locale encoding is used by default.

     Note that this is for file names only; ‘@documentencoding’ is
     always used for the encoding of file content (*note
     @documentencoding::).

     The ‘INPUT_FILE_NAME_ENCODING’ variable overrides this variable.

‘DOC_ENCODING_FOR_OUTPUT_FILE_NAME’
     If set, use the input Texinfo document encoding information for the
     encoding of output file names, such as files specified with
     ‘--output’.  If unset, use the locale encoding instead.  Default is
     unset, so file names are encoded using the current locale.

     Note that this is for file names only; ‘OUTPUT_ENCODING_NAME’ is
     used for the encoding of file content.

     The ‘OUTPUT_FILE_NAME_ENCODING’ variable overrides this variable.

...

‘INPUT_FILE_NAME_ENCODING’
     Encoding used for input file names.  This variable overrides any
     encoding from the document or current locale.  Normally, you do not
     need to set this variable, but it can be used if file names are in
     a certain character encoding on a filesystem.  An alternative is to
     set ‘DOC_ENCODING_FOR_INPUT_FILE_NAME’ to ‘0’ to use the locale
     encoding.  See also ‘OUTPUT_FILE_NAME_ENCODING’.

...

‘OUTPUT_FILE_NAME_ENCODING’
     Encoding used for output file names.  This variable overrides any
     encoding from the document or current locale.

     Normally, you do not need to set this variable, but it can be used
     if file names should be created in a certain character encoding on
     a filesystem.  See also ‘INPUT_FILE_NAME_ENCODING’.


Hopefully that covers most use cases.
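
As a rough illustration of the precedence described in the manual text
above, here is a Python sketch.  It is not the texi2any implementation;
the function name and its parameters are invented for the example, and
falling back to the locale encoding when the document declares no
encoding is an assumption on my part.

```python
import locale


def resolve_file_name_encoding(explicit_encoding, use_doc_encoding,
                               document_encoding, locale_encoding=None):
    """Pick the encoding for file names, following the manual text.

    explicit_encoding: value of INPUT_FILE_NAME_ENCODING or
        OUTPUT_FILE_NAME_ENCODING, or None if unset.
    use_doc_encoding: value of DOC_ENCODING_FOR_INPUT_FILE_NAME or
        DOC_ENCODING_FOR_OUTPUT_FILE_NAME (true means "set").
    document_encoding: encoding from @documentencoding, or None.
    """
    if locale_encoding is None:
        locale_encoding = locale.getpreferredencoding(False)
    # The *_FILE_NAME_ENCODING variable overrides everything else.
    if explicit_encoding:
        return explicit_encoding
    # Otherwise use the document encoding if the DOC_ENCODING_FOR_*
    # variable is set.  Assumption: with no document encoding
    # available, fall back to the locale.
    if use_doc_encoding and document_encoding:
        return document_encoding
    return locale_encoding


# Input file names: DOC_ENCODING_FOR_INPUT_FILE_NAME is set by default
# (except on MS-Windows).
print(resolve_file_name_encoding(None, True, "ISO-8859-1", "UTF-8"))
# Output file names: DOC_ENCODING_FOR_OUTPUT_FILE_NAME is unset by default.
print(resolve_file_name_encoding(None, False, "ISO-8859-1", "UTF-8"))
```

On the command line, the explicit override would be given with
something like 'texi2any -c INPUT_FILE_NAME_ENCODING=ISO-8859-1
manual.texi', where manual.texi is a placeholder file name.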

> > But if we want to ignore those, I agree with you that the problem
> > largely doesn't exist.
> 
> It probably exists, and hopefully will exist more, if more people use
> Texinfo in non-English languages/setups, but for now, it has been
> relatively small.  We only started getting reports on non-ASCII
> characters in UTF-8 locales very recently.
> 
> We will try to document things as well as possible, and consider all
> the suggestions, but I still bet that actual use cases of file names
> encoding requiring specific customizations will be very rare.
> 
> As for giving recommendations on practices related to these issues
> that's not something I can personally do, as I do not face those issues
> enough.

If people have use cases they find difficult to support, hopefully they
will report this so we can improve the documentation.


