Re: Standalone 'info' should recode into display's encoding

bug-texinfo

From:	Gavin Smith
Subject:	Re: Standalone 'info' should recode into display's encoding
Date:	Tue, 21 Jan 2014 20:46:25 +0000

The attached patch implements some support for degrading a UTF-8
encoded file to ASCII for display. It is against SVN revision 5400.
set_file_lc_type() checks whether the file contains a Local Variables
section with a coding line. The convert_characters() function can then
substitute characters in the file if the document encoding differs
from the encoding given by the environment. At the moment only
three UTF-8 characters are handled.
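
The substitution step described above can be sketched roughly as follows.
This is not the code from the attached patch: the function name
degrade_utf8_to_ascii(), the table layout, and the choice of the three
characters (U+2018, U+2019, U+2022) are assumptions made purely for
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement table: a UTF-8 byte sequence and the ASCII
   character shown in its place.  The three entries are illustrative
   assumptions, not the characters handled by the actual patch. */
struct replacement {
    const char *utf8;
    char ascii;
};

static const struct replacement table[] = {
    { "\xE2\x80\x98", '\'' },  /* U+2018 LEFT SINGLE QUOTATION MARK */
    { "\xE2\x80\x99", '\'' },  /* U+2019 RIGHT SINGLE QUOTATION MARK */
    { "\xE2\x80\xA2", '*' },   /* U+2022 BULLET */
};

#define TABLE_LEN (sizeof table / sizeof table[0])

/* Copy SRC into DST (capacity DST_SIZE bytes), replacing each sequence
   found in the table with its ASCII stand-in.  Bytes that match no
   entry are copied unchanged.  Output is truncated if DST is too small
   and is always NUL-terminated when DST_SIZE > 0.  Returns the number
   of bytes written, not counting the terminating NUL. */
size_t degrade_utf8_to_ascii(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    if (dst_size == 0)
        return 0;

    while (*src != '\0' && out + 1 < dst_size) {
        size_t i;

        for (i = 0; i < TABLE_LEN; i++) {
            size_t len = strlen(table[i].utf8);

            if (strncmp(src, table[i].utf8, len) == 0) {
                dst[out++] = table[i].ascii;
                src += len;
                break;
            }
        }
        if (i == TABLE_LEN)      /* no table entry matched */
            dst[out++] = *src++;
    }
    dst[out] = '\0';
    return out;
}
```

A caller would run this only when the file's declared encoding is UTF-8
and the terminal's is not; a UTF-8 sequence with no table entry passes
through untouched, so extending coverage means adding rows to the table.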

I used the strcasecmp and nl_langinfo functions; I'm not sure if they
are standard.
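
For reference, both functions are specified by POSIX (strcasecmp in
<strings.h>, nl_langinfo in <langinfo.h>) but are not part of ISO C.
A minimal sketch of how they might be combined to decide whether
conversion is needed is given below; the helper names locale_codeset()
and encodings_differ() are hypothetical and do not come from the patch.

```c
#include <langinfo.h>   /* nl_langinfo, CODESET (POSIX) */
#include <locale.h>     /* setlocale (ISO C) */
#include <strings.h>    /* strcasecmp (POSIX) */

/* Return the name of the character encoding of the current locale.
   The program should have called setlocale (LC_CTYPE, "") earlier;
   otherwise the default "C" locale is reported instead of the one
   selected by the user's environment. */
const char *locale_codeset(void)
{
    return nl_langinfo(CODESET);
}

/* Return nonzero if DOC_ENCODING (for example, the value of a coding
   line in the file) names a different encoding than CODESET.  The
   comparison ignores case only; aliases such as "utf8" versus "UTF-8"
   are not recognised by this sketch. */
int encodings_differ(const char *doc_encoding, const char *codeset)
{
    return strcasecmp(doc_encoding, codeset) != 0;
}
```

A program would typically call setlocale (LC_CTYPE, "") once at startup
and then test encodings_differ (file_encoding, locale_codeset ()) before
deciding to substitute any characters.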

I've attached the file I used to test this patch. For me it behaves
differently, as expected, in UTF-8 and Latin-1 terminals.

On Wed, Jan 1, 2014 at 12:15 AM, Karl Berry <address@hidden> wrote:
> In my experience, the problem is not specific to Info and not specific
> to quotes.  If I run cat or more or ... on a UTF-8 file in a non-UTF-8
> terminal, characters are dropped and the result beyond 7-bit ASCII is
> garbled.
>
> This has always seemed like a fundamental problem in UTF-8 usage to me,
> one that would be better addressed at the terminal level, so at least
> one can always see the bytes, if not the "best possible"
> transliteration, without every single program that writes to stdout
> having to implement the same thing.  But since nothing like that is
> going to happen, I suppose Info should somehow deal with it, just like
> every other program in the world.  Sigh.  Patches are welcome.
>
> As for controlling the output of quotes by makeinfo, an option could be
> invented, but I am not inclined to change the default behavior so I'm
> not convinced it has much utility.  We changed it in the first place
> because of vociferous complaints about getting ASCII quotes even with
> @documentencoding UTF-8.  And after all, there is some logic to using
> UTF-8 quotes when the document says it wants UTF-8.  It's no different
> in principle than accented letters.
>
> At any rate, the best answer, IMHO, not requiring any changes to any
> programs, is simply not to use @documentencoding UTF-8 unless one
> actually needs it, which should be never in English-language manuals.
> 7-bit ASCII source with Texinfo @-commands is preferable.  These days
> many people reflexively think that UTF-8 is wonderful, always use it,
> and want to inflict it on everyone else too, but that is simply wrong.
>
> karl
>

degrade_to_locale.patch
Description: Text Data

utf8.info
Description: Binary data
