bug-texinfo

Re: Non-ASCII characters in @include search path


From: Eli Zaretskii
Subject: Re: Non-ASCII characters in @include search path
Date: Sun, 20 Feb 2022 15:35:57 +0200

> Date: Sun, 20 Feb 2022 14:28:23 +0100
> From: Patrice Dumas <pertusus@free.fr>
> 
> On Sun, Feb 20, 2022 at 01:09:06PM +0000, Gavin Smith wrote:
> > 
> > My thought was that the argument to -I could have been any sequence of
> > bytes, not necessarily correct UTF-8.  It would be wrong then to attempt
> > any encoding or decoding to a string formed from such an argument.
> 
> Indeed, that must be what is happening here.  I think that it is not
> necessarily wrong to do decoding.  Actually, if the locale is not
> consistent with the encoding expected for file names, it would be even
> better to first decode command line arguments to the perl internal
> unicode encoding, then encode to the encoding that should work for
> operations using filenames.
> 
> That is the solution that I would favor.

If you want the Texinfo sources to be in UTF-8 internally, it might be
impossible not to decode the command-line arguments into UTF-8.  Only
if the command-line argument is used to access file names, and doesn't
seep into the rest of the output, can you use the original byte
sequence.  And even then it might be problematic: e.g., what if the
argument of -I is in some non-UTF-8 encoding, and the source uses
@include with a non-ASCII file name encoded according to
@documentencoding, which is UTF-8?  You need to construct a complete
file name from 2 parts that are encoded differently.
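
To make the mixed-encoding case concrete, here is a minimal sketch in
Python (not the Perl code texi2any actually uses).  The directory and
file names are hypothetical, and it assumes the encoding of the -I
argument (Latin-1 here) is somehow known, which in practice it may not
be.  It compares joining the raw bytes as-is with decoding both parts
first and encoding the result once.

```python
import os

# Hypothetical inputs, chosen only for illustration:
# the -I argument arrives as raw bytes in Latin-1 ("r\u00e9pertoire"),
include_dir = "r\u00e9pertoire".encode("latin-1")
# the @include argument was decoded from a UTF-8 source ("\u00e9t\u00e9.texi").
include_name = "\u00e9t\u00e9.texi"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Option 1: keep the -I bytes untouched and append the UTF-8 encoded name.
# The result mixes two encodings in a single path.
mixed_path = os.path.join(include_dir, include_name.encode("utf-8"))

# Option 2: decode the directory (assuming its encoding is known), join the
# two parts as text, then encode once to the encoding used for file names.
uniform_path = os.path.join(
    include_dir.decode("latin-1"), include_name
).encode("utf-8")

print(is_valid_utf8(mixed_path))
print(is_valid_utf8(uniform_path))
```

Option 1 preserves the exact bytes the user passed to -I, so it can
still find a directory whose name is stored in that encoding, but the
combined path cannot be shown or written out as UTF-8.  Option 2
yields a consistent name, but only matches the file system if the
directory name really is stored in the target encoding.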


