bug-texinfo

Re: Skip filename recoding tests on MS-Windows


From: Gavin Smith
Subject: Re: Skip filename recoding tests on MS-Windows
Date: Wed, 26 Oct 2022 17:34:56 +0100

On Wed, Oct 26, 2022 at 05:47:54PM +0200, pertusus@free.fr wrote:
> On Wed, Oct 26, 2022 at 03:14:26PM +0300, Eli Zaretskii wrote:
> > > Date: Wed, 26 Oct 2022 11:03:53 +0200
> > > From: pertusus@free.fr
> > > Cc: GavinSmith0123@gmail.com, bug-texinfo@gnu.org
> > > 
> > > Let's call LOC your locale.  The setup is a manual encoded in Latin1, and
> > > an include file included_latîn1.texi.  On your computer, the î in the
> > > include file is stored as 0x05DE, which is the conversion of 0xEE in the
> > > LOC codepage.
> > 
> > For this to work, the non-ASCII character we use should be encodable
> > both by Latin1 and by the Windows codepages.  This is a tough
> > requirement, but if you look at the tables of these encodings, you
> > will see that some codepoints between 0xA1 and 0xAF are identical
> > between many Windows codepages and Latin1.  For example, 0xAB is
> > identical in many codepages.  So maybe we could try such a character,
> > for these tests?
> 
> I set the character to the Yen and Yuan sign which is in the range.
> 
> It is not fully clear to me how this changes what we do with the tests,
> though...

I thought we were going to skip these tests.

I heard that ¥ was a forbidden character on Windows filesystems,
due to confusion with the backslash, so I can imagine that this could
easily cause problems.

https://learn.microsoft.com/en-us/windows/win32/intl/character-sets-used-in-file-names


> Caution
> 
> Windows code page and OEM code page character sets used on
> Japanese-language operating systems contain the Yen symbol (¥) instead
> of a backslash (\). Thus, the Yen symbol is a prohibited character for
> NTFS and FAT file systems. When mapping Unicode to a Japanese-language
> code page, WideCharToMultiByte and other conversion functions map
> both backslash (U+005C) and the normal Unicode Yen symbol (U+00A5) to
> this same character. For security reasons, your applications should
> not typically allow the character U+00A5 in a Unicode string that
> might be converted for use as a FAT file name. For more information,
> see Security Considerations: International Features.
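
The claim quoted above, that some bytes between 0xA1 and 0xAF mean the same
thing in Latin1 and in many Windows codepages, can be checked with a short
script.  This is only a rough sketch and not part of the Texinfo test suite:
the list of codepages (cp1250 to cp1258) and the helper names
decode_byte, shared_with_latin1 and has_yen_sign are assumptions made for
illustration.  Multi-byte codepages such as the Japanese one are deliberately
left out.

```python
# Illustrative sketch only: the codepage list and helper names are assumptions.
CODEPAGES = ["cp1250", "cp1251", "cp1252", "cp1253", "cp1254",
             "cp1255", "cp1256", "cp1257", "cp1258"]


def decode_byte(value, encoding):
    """Return the character a single byte decodes to, or None if undefined."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None


def shared_with_latin1(start=0xA1, end=0xAF, codepages=CODEPAGES):
    """List byte values that decode to the Latin1 character in every codepage."""
    shared = []
    for value in range(start, end + 1):
        latin1_char = decode_byte(value, "latin-1")
        if all(decode_byte(value, cp) == latin1_char for cp in codepages):
            shared.append(value)
    return shared


def has_yen_sign(filename):
    """Report whether a file name contains U+00A5, which the caution above
    says should typically be avoided in names that may reach a FAT file system."""
    return "\u00a5" in filename


if __name__ == "__main__":
    for value in shared_with_latin1():
        # ascii() keeps the output printable on any console encoding.
        print(f"0x{value:02X} -> {ascii(decode_byte(value, 'latin-1'))}")
```

A candidate test file name could then be screened with has_yen_sign() before
it is used, and a replacement character picked from the values that
shared_with_latin1() reports.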


