Re: Skip filename recoding tests on MS-Windows

From: pertusus
Subject: Re: Skip filename recoding tests on MS-Windows
Date: Sun, 23 Oct 2022 20:27:38 +0200

On Sun, Oct 23, 2022 at 06:41:34PM +0100, Gavin Smith wrote:
> On Sun, Oct 23, 2022 at 08:13:24PM +0300, Eli Zaretskii wrote:
> > > I am pretty sure that this file is correctly generated, I guess that
> > > מ corresponds to the same octet than î in latin1, which is 0xEE unless I
> > > missed something, and your codepage would be Windows-1255, maybe?
> > 
> > Yes, it is.
> > 
> > > So, there seems to be no trouble creating a correctly encoded file name
> > > which, if interpreted as ISO-8859-1 gives the correct binary string.
> > 
> > Yes.  I think the problem is not in generating the file name, it is in
> > using that file later.
> Is the filesystem on Windows not usually NTFS which stores filenames in
> UTF-16?  So the file would be created with some UTF-16 name, even if it
> appears to programs in some 8-bit encoding depending on the code page.
> It seems relevant what the file name actually created is.  If it is not
> created with the correct name then it would not be possible to open it.

I tried to analyse what is going on in another mail I sent.  I may be
completely off, but if the file name is actually stored as UTF-16 and
that is the encoding stat() expects, then it is not surprising that the
file name cannot be found.  If stat() instead uses the locale to convert
the name from the locale encoding to UTF-16, it is also not surprising
that the file name cannot be found, unless the locale is, by chance, a
Latin-1 locale.
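
To illustrate the second hypothesis, here is a small Python sketch.  It
only shows how the same bytes decode under two encodings; it does not
call stat() and says nothing certain about what Windows really does.
The names RAW and decode_name are made up for the example, and cp1255
is Python's codec name for Windows-1255.

```python
# Sketch only: models a locale-based conversion of a byte file name
# to a Unicode name, as hypothesised above.  Not actual Windows code.

# The file name as bytes, with 0xEE where Latin-1 has "î".
RAW = b"included_lat\xeen1.texi"


def decode_name(raw: bytes, encoding: str) -> str:
    """Decode a byte file name as a given locale encoding would."""
    return raw.decode(encoding)


# Interpreted as ISO-8859-1, byte 0xEE is U+00EE (î).
latin1_name = decode_name(RAW, "iso-8859-1")

# Interpreted as Windows-1255, the same byte is U+05DE (Hebrew mem).
cp1255_name = decode_name(RAW, "cp1255")

# The two Unicode names differ, so their UTF-16 forms differ too.
print(latin1_name.encode("utf-16-le").hex())
print(cp1255_name.encode("utf-16-le").hex())
```

If the file is stored under one of these Unicode names and looked up
under the other, the lookup would fail.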

> Is it possible for you to find the "included_latמn1.texi" file in the
> Windows file explorer and check what its name really is?
> I'm really doubtful that these tests can be made to work - if you are
> limited to an 8-bit encoding that is not Latin-1, how are tests using
> Latin-1 only characters going to work?  It seems easier for all involved
> to skip these tests.

It could still work if the byte used is correct, like \xEE for î, even
if the locale is not a Latin-1 locale.
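
A rough way to see why: if creating the file and looking it up both go
through the same code page conversion (an assumption on my side, not
something I checked on Windows), the byte survives the round trip even
though the character shown is not î.  The helper roundtrip below is
hypothetical and only exercises Python's codecs.

```python
# Sketch only: assumes file creation and lookup use the same
# code page for the bytes <-> Unicode conversion.

RAW_NAME = b"included_lat\xeen1.texi"


def roundtrip(raw: bytes, codepage: str) -> bytes:
    """Decode bytes with a code page, then encode them back."""
    return raw.decode(codepage).encode(codepage)


# Under Windows-1255 the name displays with a Hebrew letter, but the
# byte sequence that comes back is unchanged, 0xEE included.
same_bytes = roundtrip(RAW_NAME, "cp1255") == RAW_NAME
print(same_bytes)
```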

