bug-texinfo

Re: Skip filename recoding tests on MS-Windows


From: pertusus
Subject: Re: Skip filename recoding tests on MS-Windows
Date: Tue, 25 Oct 2022 21:49:04 +0200

On Tue, Oct 25, 2022 at 10:32:31PM +0300, Eli Zaretskii wrote:
> > Date: Tue, 25 Oct 2022 21:18:35 +0200
> > From: pertusus@free.fr
> > Cc: GavinSmith0123@gmail.com, bug-texinfo@gnu.org
> > 
> > > It should work if the document is indeed in the expected encoding.
> > > But if the file is actually encoded in something other, especially if
> > > the encoding is multibyte (like UTF-8), it will not work.
> > 
> > Indeed, it is not reliable, but what would be the best default?  It
> > seems to me that Windows adds additional possibilities for anything to
> > fail.  However, on the issue of using the codepage to encode file names
> > in texi2any versus using the input file encoding, it does not seem to
> > me that Windows is special.  If we use the input file encoding on other
> > platforms, assuming that the use case is converting manuals from
> > archives where all the files are similarly encoded, consistently with
> > manuals, it seems to me that Windows is not very special.  It will
> > fail in some cases on Windows, but using the user codepage will decrease
> > even more the possibility that the result is correct (files with encoded
> > characters in their names are found).  Are you still sure that using
> > the user current codepage is the best in this situation?
> 
> For the encoding of the document, @documentencoding should work on
> Windows as it does elsewhere.  So I'm not sure why we use a different
> default.  Is that only for the case where there's no @documentencoding
> in the Texinfo source?  If not, when will this default be used?

There is no different default for the document encoding on Windows;
the difference is only in the encoding of file names.

> The only part that is, I think, different on Windows is the encoding of
> file names, because Windows doesn't treat file names as opaque
> bytestreams.  But anything that comes from a Texinfo source, even the
> name of an included file, should be interpreted according to
> @documentencoding.  When accessing included files on Windows, we
> should re-encode the file names to the locale's encoding, because
> nothing else will work reliably.  Is that what we do?

Yes, but it does not work reliably either, as shown by the test
results.  The test which uses the locale's encoding fails (formatting
manual_include_accented_file_name_latin1), while the test in which the
document encoding is used (formatting
manual_include_accented_file_name_latin1_explicit_encoding) does not
fail.  As analysed just before, it works because both Windows and Perl
are consistently wrong, but still it seems to work better.
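
To make the re-encoding step concrete, here is a minimal sketch in
Python.  It is only an illustration, not the texi2any code (which is
in Perl): the function name recode_file_name and the file name
included_é.texi are made up for the example, and the encodings are
just sample choices.  It shows that the bytes used to look up an
included file depend on which encoding is chosen for file names.

```python
# Hypothetical illustration; recode_file_name is not a texi2any function.

def recode_file_name(raw: bytes, document_encoding: str,
                     file_name_encoding: str) -> bytes:
    """Decode a file name as read from the Texinfo source, then
    re-encode it with the encoding chosen for file names."""
    text = raw.decode(document_encoding)
    return text.encode(file_name_encoding)

# In a Latin-1 encoded manual, "é" in an @include argument is one byte, 0xE9.
raw = "included_é.texi".encode("latin-1")

# Using the document encoding for file names leaves the bytes unchanged.
same = recode_file_name(raw, "latin-1", "latin-1")

# Using another encoding for file names changes the bytes, so a file
# whose name was stored with the Latin-1 bytes is no longer matched.
as_utf8 = recode_file_name(raw, "latin-1", "utf-8")
as_cp437 = recode_file_name(raw, "latin-1", "cp437")

print(same == raw)   # True
print(as_utf8)       # b'included_\xc3\xa9.texi'
print(as_cp437)      # b'included_\x82.texi'

# If the source is actually UTF-8 but is decoded as Latin-1, decoding
# does not fail; it silently yields the wrong characters.
print("é".encode("utf-8").decode("latin-1"))  # Ã©
```

The sketch only compares byte strings; it does not model how Windows
itself converts file names, which is where the additional failures
discussed above come from.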

-- 
Pat


