Re: Non-ASCII characters in @include search path
From: Patrice Dumas
Subject: Re: Non-ASCII characters in @include search path
Date: Sun, 20 Feb 2022 14:45:14 +0100
On Sun, Feb 20, 2022 at 03:35:57PM +0200, Eli Zaretskii wrote:
> > Date: Sun, 20 Feb 2022 14:28:23 +0100
> > From: Patrice Dumas <pertusus@free.fr>
> >
> > On Sun, Feb 20, 2022 at 01:09:06PM +0000, Gavin Smith wrote:
> > >
> > > My thought was that the argument to -I could have been any sequence of
> > > bytes, not necessarily correct UTF-8. It would be wrong then to attempt
> > > any encoding or decoding to a string formed from such an argument.
> >
> > Indeed, that must be what is happening here. I think that it is not
> > necessarily wrong to do decoding. Actually, if the locale is not
> > consistent with the encoding expected for file names, it would be even
> > better to first decode command line arguments to the perl internal
> > unicode encoding, then encode to the encoding that should work for
> > operations using filenames.
> >
> > That is the solution that I would favor.
>
> If you want the Texinfo sources to be in UTF-8 internally, it might be
> impossible not to decode the command-line arguments into UTF-8. Only
> if the command-line argument is used to access file names, and doesn't
> seep into the rest of the output, you can use the original byte
> sequence. And even then it might be problematic: e.g., what if the
> argument of -I is in some non-UTF-8 encoding, and the source uses
> @include with a non-ASCII file name encoded according to
> @documentencoding, which is UTF-8? You need to construct a complete
> file name from 2 parts that are encoded differently.
I agree with you, and that is actually what I was proposing. I propose to
decode command-line arguments into Perl's internal Unicode representation,
and to encode file names to the file system encoding as late as possible.
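
To illustrate the ordering (decode early, encode late), here is a minimal
sketch. It is written in Python rather than Perl purely for illustration and
is not Texinfo code: the function names, the choice of Latin-1 for the
command-line locale, and UTF-8 for both @documentencoding and the file system
are all assumptions made for the example.

```python
import posixpath


def decode_arg(raw: bytes, locale_encoding: str) -> str:
    """Decode a raw command-line argument (e.g. the value of -I)
    into a Unicode string, using the assumed locale encoding."""
    return raw.decode(locale_encoding)


def decode_include_name(raw: bytes, document_encoding: str) -> str:
    """Decode an @include argument read from the source file,
    using the assumed document encoding."""
    return raw.decode(document_encoding)


def file_system_path(include_dir: str, file_name: str, fs_encoding: str) -> bytes:
    """Join the two parts as Unicode strings, and only then encode the
    complete file name to the (assumed) file system encoding."""
    return posixpath.join(include_dir, file_name).encode(fs_encoding)


# Hypothetical inputs: the -I argument arrives as Latin-1 bytes, while
# the @include argument is UTF-8, following @documentencoding.
dir_bytes = "café".encode("latin-1")
name_bytes = "résumé.texi".encode("utf-8")

include_dir = decode_arg(dir_bytes, "latin-1")
file_name = decode_include_name(name_bytes, "utf-8")
path_bytes = file_system_path(include_dir, file_name, "utf-8")
```

Concatenating dir_bytes and name_bytes directly would give a byte sequence
that mixes two encodings and is valid in neither, which is the case Eli
describes above. Joining after decoding avoids that, at the cost of failing
on arguments that are not valid in the assumed locale encoding.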
--
Pat