Re: Standalone 'info' should recode into display's encoding
From: Gavin Smith
Subject: Re: Standalone 'info' should recode into display's encoding
Date: Tue, 21 Jan 2014 20:46:25 +0000

The attached patch implements partial support for degrading a UTF-8
encoded file to ASCII. The patch is against SVN revision 5400.

set_file_lc_type() checks whether the file contains a Local Variables
section with a coding line. The convert_characters() function can then
substitute characters in the file if the document encoding differs
from the encoding given by the environment. At the moment only three
UTF-8 characters are implemented.
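
To make the two steps above concrete, here is a rough sketch of how they could look. It is not the code from the attached patch: the names find_file_coding and degrade_utf8_to_ascii are hypothetical, the replacement table (four curly quotes) is my own choice because the message does not list the characters the patch handles, and the file contents are assumed to be available as one NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement table.  The characters handled by the real
   patch are not listed in the message; these are illustrative only. */
struct degrade_entry {
  const char *utf8;   /* UTF-8 byte sequence to look for */
  const char *ascii;  /* ASCII text to emit instead */
};

static const struct degrade_entry degrade_table[] = {
  { "\xE2\x80\x98", "'" },   /* U+2018 LEFT SINGLE QUOTATION MARK */
  { "\xE2\x80\x99", "'" },   /* U+2019 RIGHT SINGLE QUOTATION MARK */
  { "\xE2\x80\x9C", "\"" },  /* U+201C LEFT DOUBLE QUOTATION MARK */
  { "\xE2\x80\x9D", "\"" },  /* U+201D RIGHT DOUBLE QUOTATION MARK */
};

/* Look for a "coding:" line after "Local Variables:" in CONTENTS and
   copy its value into BUF.  Return 1 if a value was found, else 0.
   Assumes CONTENTS is NUL-terminated. */
int
find_file_coding (const char *contents, char *buf, size_t buf_size)
{
  const char *p;
  size_t n = 0;

  assert (buf_size > 0);
  buf[0] = '\0';

  p = strstr (contents, "Local Variables:");
  if (!p)
    return 0;
  p = strstr (p, "coding:");
  if (!p)
    return 0;
  p += strlen ("coding:");

  while (*p == ' ' || *p == '\t')
    p++;
  while (*p && *p != '\n' && *p != '\r' && *p != ' ' && *p != '\t'
         && n + 1 < buf_size)
    buf[n++] = *p++;
  buf[n] = '\0';
  return n > 0;
}

/* Copy IN to OUT, replacing the sequences in degrade_table with ASCII.
   OUT must hold at least strlen (IN) + 1 bytes; that is enough because
   every replacement is shorter than the sequence it replaces.
   Return the number of bytes written, not counting the final NUL. */
size_t
degrade_utf8_to_ascii (const char *in, char *out, size_t out_size)
{
  size_t n_entries = sizeof degrade_table / sizeof degrade_table[0];
  size_t o = 0;

  assert (out_size >= strlen (in) + 1);

  while (*in)
    {
      size_t i;
      for (i = 0; i < n_entries; i++)
        {
          size_t len = strlen (degrade_table[i].utf8);
          if (strncmp (in, degrade_table[i].utf8, len) == 0)
            {
              size_t rlen = strlen (degrade_table[i].ascii);
              memcpy (out + o, degrade_table[i].ascii, rlen);
              o += rlen;
              in += len;
              break;
            }
        }
      if (i == n_entries)
        out[o++] = *in++;   /* no match: copy the byte unchanged */
    }
  out[o] = '\0';
  return o;
}
```

In this sketch, bytes that match no table entry are passed through untouched, so unknown non-ASCII characters would still reach the terminal as raw UTF-8. A fuller version would need a policy for those, for example a placeholder character.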

I used the strcasecmp and nl_langinfo functions; I'm not sure whether
they are standard.
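
For reference, both functions are specified by POSIX rather than ISO C: strcasecmp is declared in <strings.h> and nl_langinfo in <langinfo.h>. The sketch below shows one way they might be combined to decide whether conversion is needed. The function names same_encoding, locale_codeset and needs_conversion are hypothetical and not taken from the patch, and calling setlocale inside the helper is a simplification for the example.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <strings.h>

/* Return nonzero if the two encoding names match, ignoring case.
   Spelling variants such as "UTF-8" versus "utf8" are NOT treated
   as equal here. */
int
same_encoding (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  return strcasecmp (a, b) == 0;
}

/* Return the codeset name of the locale selected by the environment.
   Side effect: sets LC_CTYPE from the environment. */
const char *
locale_codeset (void)
{
  setlocale (LC_CTYPE, "");
  return nl_langinfo (CODESET);
}

/* Return nonzero if a file in FILE_ENCODING should be converted
   before being shown in the current locale. */
int
needs_conversion (const char *file_encoding)
{
  return !same_encoding (file_encoding, locale_codeset ());
}
```

The codeset string returned by nl_langinfo varies between systems, so a real implementation would probably want to normalise the names before comparing them.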

I've attached a file I used to test this patch. For me it gives the
expected, different behaviour in UTF-8 and Latin-1 terminals.

On Wed, Jan 1, 2014 at 12:15 AM, Karl Berry <address@hidden> wrote:
> In my experience, the problem is not specific to Info and not specific
> to quotes. If I run cat or more or ... on a UTF-8 file in a non-UTF-8
> terminal, characters are dropped and the result beyond 7-bit ASCII is
> garbled.
>
> This has always seemed like a fundamental problem in UTF-8 usage to me,
> one that would be better addressed at the terminal level, so at least
> one can always see the bytes, if not the "best possible"
> transliteration, without every single program that writes to stdout
> having to implement the same thing. But since nothing like that is
> going to happen, I suppose Info should somehow deal with it, just like
> every other program in the world. Sigh. Patches are welcome.
>
> As for controlling the output of quotes by makeinfo, an option could be
> invented, but I am not inclined to change the default behavior so I'm
> not convinced it has much utility. We changed it in the first place
> because of vociferous complaints about getting ASCII quotes even with
> @documentencoding UTF-8. And after all, there is some logic to using
> UTF-8 quotes when the document says it wants UTF-8. It's no different
> in principle than accented letters.
>
> At any rate, the best answer, IMHO, not requiring any changes to any
> programs, is simply not to use @documentencoding UTF-8 unless one
> actually needs it, which should be never in English-language manuals.
> 7-bit ASCII source with Texinfo @-commands is preferable. These days
> many people reflexively think that UTF-8 is wonderful, always use it,
> and want to inflict it on everyone else too, but that is simply wrong.
>
> karl
>

Attachment: degrade_to_locale.patch
Description: Text Data

Attachment: utf8.info
Description: Binary data