Re: Skip filename recoding tests on MS-Windows

From: Eli Zaretskii
Subject: Re: Skip filename recoding tests on MS-Windows
Date: Sun, 23 Oct 2022 22:00:19 +0300

> Date: Sun, 23 Oct 2022 20:22:29 +0200
> From: pertusus@free.fr
> Cc: GavinSmith0123@gmail.com, bug-texinfo@gnu.org
> On Sun, Oct 23, 2022 at 07:10:57PM +0300, Eli Zaretskii wrote:
> > > 
> > > Which is problematic: it means that with a correctly set up input file
> > > with a Latin1-encoded character in the name, something wrong is going on.
> > 
> > The character is supposed to be encoded in Latin1, but I don't think
> > it is, because Latin1 is not the locale's encoding here.
> I think that it is encoded in Latin1, as discussed in another mail.

The Perl program which creates the file wanted a Latin1 encoding, but
Windows has its own ideas about that, as I explained in my other mail.

> >  And some
> > programs involved in these tests could decide they don't understand
> > the character and replace it with a '?'.
> There are no programs in these tests other than texi2any.  The ?
> appearing in the message may come from Perl's Encode module, which does
> not know how to encode from Perl's internal encoding to the C locale
> that the test sets up with
> LC_ALL=C; export LC_ALL

I think it could come from the Bash I'm using here.

> The failure of manual_include_accented_file_name_latin1_explicit_encoding
> is more surprising to me.  In that case INPUT_FILE_NAME_ENCODING is
> set to ISO-8859-1, so I do not understand why the test fails: the
> reverse encoding from UTF-8 to ISO-8859-1 should lead to a path that can
> be found.  Paths are looked up in locate_include_file() in input.c,
> which could be where something unexpected happens, maybe if stat() on
> Windows does some kind of conversion.

'stat' doesn't do any conversions, it uses the byte stream in the
'char *' file-name argument we feed it.  What is expected to be in
that file-name argument?  Where did the UTF-8 encoded input file name
come from in that case?  Did we read it from the filesystem, from some
file in the source tree, or from somewhere else?
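
To make concrete which bytes 'stat' would receive in each case, here is a
minimal sketch.  It is not texinfo's encode_file_name(); the names
utf8_to_latin1 and file_exists are hypothetical and exist only for this
illustration.  It assumes the name arrives as UTF-8 and is meant to be
handed to 'stat' as ISO-8859-1, so that î goes from the two bytes
\xC3\xAE to the single byte \xEE.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Illustrative only, not texinfo's encode_file_name(): convert the
   UTF-8 string SRC to ISO-8859-1 in DST.  Returns the number of bytes
   written (not counting the terminating NUL), or -1 if SRC contains a
   character outside Latin-1, is malformed, or does not fit in DST.  */
long
utf8_to_latin1 (const char *src, char *dst, size_t dstlen)
{
  const unsigned char *p = (const unsigned char *) src;
  size_t n = 0;

  if (dstlen == 0)
    return -1;

  while (*p)
    {
      unsigned char c;

      if (*p < 0x80)
        c = *p++;               /* ASCII is the same in both encodings */
      else if ((*p == 0xC2 || *p == 0xC3) && (p[1] & 0xC0) == 0x80)
        {
          /* U+0080..U+00FF: lead byte 0xC2 or 0xC3, one continuation byte */
          c = (unsigned char) (((p[0] & 0x03) << 6) | (p[1] & 0x3F));
          p += 2;
        }
      else
        return -1;              /* not representable in Latin-1 */

      if (n + 1 >= dstlen)
        return -1;              /* keep room for the NUL */
      dst[n++] = (char) c;
    }
  dst[n] = '\0';
  return (long) n;
}

/* Hand the bytes of NAME to stat() unchanged; nonzero if it succeeds.  */
int
file_exists (const char *name)
{
  struct stat st;
  return stat (name, &st) == 0;
}
```

Calling file_exists on both the UTF-8 form and the converted form of the
same name would show which byte sequence, if either, matches what is
actually on disk.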

> Debugging the
> manual_include_accented_file_name_latin1_explicit_encoding
> test further would require showing the string bytes before and after the
> call to encode_file_name() in end_line.c.  Then, if the bytes seem to
> match the expected Latin1 string with \xEE for î, check whether something
> unexpected happens in locate_include_file(), perhaps by printing the
> values of filename to see whether there is indeed one for which stat
> should return 0.
> Eli, I do not know if it is practical for you to do that?

Would printf's to stderr in input.c be visible when running tests?  If
so, I can show these byte streams.
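
For reference, a minimal sketch of the kind of debugging output meant
here.  Both format_bytes and dump_file_name are hypothetical helpers, not
existing texinfo functions, and where to call them (for example around
the 'stat' call in locate_include_file) is an assumption.  Non-ASCII
bytes are printed as \xHH so that neither the terminal nor the shell can
replace them with '?'.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debugging helper: write the bytes of NAME into BUF as
   text, keeping printable ASCII as-is and showing every other byte
   (and the backslash itself) as \xHH.  Output is truncated if BUF is
   too small.  Returns BUF.  */
char *
format_bytes (const char *name, char *buf, size_t buflen)
{
  const unsigned char *p;
  size_t used = 0;

  if (buflen == 0)
    return buf;
  buf[0] = '\0';

  for (p = (const unsigned char *) name; *p; p++)
    {
      int printable = (*p >= 0x20 && *p < 0x7F && *p != '\\');
      size_t need = printable ? 1 : 4;

      if (used + need >= buflen)
        break;                  /* truncate rather than overflow */
      if (printable)
        buf[used++] = (char) *p;
      else
        used += (size_t) sprintf (buf + used, "\\x%02X", (unsigned) *p);
      buf[used] = '\0';
    }
  return buf;
}

/* Print LABEL followed by the escaped bytes of NAME to stderr.  */
void
dump_file_name (const char *label, const char *name)
{
  char buf[1024];

  fprintf (stderr, "%s: %s\n", label,
           format_bytes (name, buf, sizeof buf));
}
```

A Latin1-encoded î would then show up as \xEE, and a UTF-8 encoded one as
\xC3\xAE.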
