bug-texinfo
Re: Non-ASCII characters in @include search path


From: Eli Zaretskii
Subject: Re: Non-ASCII characters in @include search path
Date: Thu, 24 Feb 2022 08:36:48 +0200

> From: Gavin Smith <gavinsmith0123@gmail.com>
> Date: Wed, 23 Feb 2022 20:38:08 +0000
> Cc: pertusus@free.fr, trash.paradise@protonmail.com, bug-texinfo@gnu.org
> 
> > Are you describing what we will do in makeinfo, or are you describing
> > how the current makeinfo, which doesn't re-encode file names, works?
> > 
> > If the latter, then Windows file-related APIs will assume that the
> > file names we pass to them (taken from the Texinfo source's @include
> > or @image directives) are KOI-8 encoded, and will attempt to convert
> > the UTF-8 byte sequences to UTF-16 as if they were KOI-8 encoded.  The
> > results will never be pretty, and if some byte doesn't exist in the
> > KOI-8 encoding, the conversion will yield a question mark '?' or a
> > space character; in the former case, the API call will likely fail
> > because '?' is not allowed in Windows file names.
> 
> I meant what we would do in makeinfo.  The behaviour you describe is
> useful to know.  If the codeset affects how filenames are accessed
> through the file APIs, then it makes sense to convert filenames to that
> codeset (for MS-Windows only).  On the other systems we support, there
> is not this extra layer of conversion where the files are stored with
> UTF-16 names but the file APIs take filenames encoded in the current
> codeset.
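
The mis-decoding described in the quoted text can be illustrated with a
minimal Python sketch.  The file name is hypothetical, and Python's
koi8_r codec is only a stand-in for the conversion the Windows file
APIs perform internally, so treat this as an approximation rather than
a reproduction of what Windows does:

```python
# Hypothetical file name, as it might appear in a UTF-8 encoded Texinfo source.
name = "файл.texi"

# The bytes that would reach the file APIs if the name were not re-encoded.
utf8_bytes = name.encode("utf-8")

# Interpret those bytes as KOI8-R, roughly as a system with a KOI8-R
# codeset would.  Each byte turns into one (wrong) character.
misread = utf8_bytes.decode("koi8_r", errors="replace")

print(repr(name))
print(repr(misread))
print(len(name), len(misread))
```

The misread name is longer than the original because every Cyrillic
letter occupies two bytes in UTF-8, and each of those bytes is decoded
as a separate character.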

I'm questioning the wisdom of different behavior on Windows vs POSIX
systems.  Why not convert the file names to the locale's codeset on
all the supported systems?  In most cases, that would be a no-op,
since most users of POSIX systems use UTF-8 nowadays anyway.  But if
someone does use a different encoding for file names, they should
expect us to support the "normal" use case, whereby the file names are
encoded in the filesystem with the locale's codeset, and are
recognized even if the Texinfo source names them using a different
encoding.
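
A rough sketch of that conversion, in Python for brevity.  The function
name and its parameters are hypothetical and are not taken from
makeinfo; the sketch assumes the file name is available as raw bytes in
the document's encoding, and that the locale's codeset is passed in by
the caller (it could be obtained with locale.getpreferredencoding):

```python
import codecs


def reencode_file_name(raw: bytes, document_encoding: str,
                       locale_encoding: str) -> bytes:
    """Convert a file name from the document encoding to the locale's codeset.

    Hypothetical helper for illustration only.
    """
    # Compare canonical codec names so that "UTF-8" and "utf8" count as equal.
    same = (codecs.lookup(document_encoding).name
            == codecs.lookup(locale_encoding).name)
    if same:
        # The common case on modern systems: nothing to convert.
        return raw
    # Raises UnicodeDecodeError or UnicodeEncodeError if the name cannot be
    # represented; a real implementation would need to decide how to report it.
    return raw.decode(document_encoding).encode(locale_encoding)


# A UTF-8 document naming a file on a system whose locale uses KOI8-R.
converted = reencode_file_name("файл.texi".encode("utf-8"), "utf-8", "koi8_r")
print(converted == "файл.texi".encode("koi8_r"))
```

When both encodings are the same, the bytes are returned unchanged,
which is the no-op case mentioned above.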
