Re: Texinfo translation error, texinfo_document domain

From: Patrice Dumas
Subject: Re: Texinfo translation error, texinfo_document domain
Date: Tue, 1 Nov 2022 01:19:04 +0100

On Tue, Nov 01, 2022 at 12:47:22AM +0100, Bruno Haible wrote:
> Hi Patrice,
> Patrice Dumas wrote:
> > > My complaint was only about the (apparent / confused) need to use TeXinfo
> > > syntax *for non-ASCII characters*.
> > 
> > It should only be required for non-ASCII characters if the encoding of the
> > po file is us-ascii, for example in pt_BR.us-ascii.po the @-commands for
> > accents need to be used.
> > 
> > In other cases, I think that it is best to leave it to the translators.
> > They can use @-commands if they wish, and not use them if they don't
> > want to, both for accented letters and, more generally, for styling.
> > 
> > This is explained in
> > https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Internationalization-of-Document-Strings.html
> > 
> > (though upon reading it I realized that it is a bit out of date...).
> In this documentation page
> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Internationalization-of-Document-Strings.html
> some importance is given to the @documentencoding.
> The @documentencoding is documented in
> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/_0040documentencoding.html
> The way I read this page, it recommends to use '@documentencoding US-ASCII'
> when possible.

Beware that this has changed in the forthcoming release, in which UTF-8
is considered the preferred encoding.

> When I combine this with what you wrote above, the recommendation is
>   - to use the US-ASCII encoding for the texinfo_document domain PO files,
>   - to use the @-commands for the non-ASCII characters in these PO files.

This is not the recommendation; it is a possibility for the languages
whose non-ASCII characters can be represented with @-commands.

> Can we remove the obstacles that (apparently) discourage Japanese, Chinese,
> Hindi translators from producing PO files for the texinfo_document domain?
> Gavin observed that many translation teams are not very active. True. But if
> you give them enough years of time, and if the usual tools & procedures are
> supported, they will some day do the translations.
> 1) https://www.gnu.org/software/texinfo/manual/texinfo/html_node/_0040documentencoding.html
> talks about the 'coding:' marker in Info files. Is it a problem to have an
> info file with 'coding: UTF-8' nowadays?

No, and it is the default.

> If users are in an old-style ISO-8859-1 locale, then the 'info' program
> will hopefully convert the contents to ISO-8859-1 on-the-fly for display
> (like it surely does in the opposite case, when viewing an info file with
> 'coding: ISO-8859-1' in an UTF-8 locale)?

I think so.
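
As a rough illustration of what such on-the-fly recoding amounts to, here is
a minimal Python sketch. It is not a description of how the info reader is
implemented; the function name and its defaults are hypothetical, and
replacing unrepresentable characters with "?" is just one possible fallback.

```python
def recode_for_locale(data: bytes,
                      file_coding: str = "utf-8",
                      locale_coding: str = "iso-8859-1") -> bytes:
    """Recode file contents from the file's declared coding to the locale's.

    Characters that the locale encoding cannot represent are replaced
    with "?" (an assumption made for this sketch).
    """
    text = data.decode(file_coding)
    return text.encode(locale_coding, errors="replace")


if __name__ == "__main__":
    utf8_bytes = "caf\u00e9".encode("utf-8")
    # The accented letter fits in ISO-8859-1, so it survives as one byte.
    print(recode_for_locale(utf8_bytes))
```

Text in Japanese or Chinese has no ISO-8859-1 representation, so in this
sketch every such character would come out as "?".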

> 2) https://www.gnu.org/software/texinfo/manual/texinfo/html_node/_0040documentencoding.html
> also says "in Info and plain text output ... accent constructs and special
> characters ... are output as the actual 8-bit or UTF-8 character in the
> given encoding where possible."
> Is this a problem? Nowadays, UTF-8 is the standard encoding for text files.
> It's ISO-8859-1 encoded files which sometimes display in a weird way.

Indeed, with UTF-8 it should always be possible. The manual could be
changed to read:

as the actual UTF-8 character, or as the actual 8-bit character where possible

> 3) "For maximum portability of Texinfo documents across the many different
> user environments in the world, we recommend sticking to 7-bit ASCII in the
> input"
> Is this recommendation still relevant (in view of the Japanese and Chinese
> support)?

This has been removed already.

> 4) https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Internationalization-of-Document-Strings.html
> Why is the @documentencoding relevant here? Why can't TeXinfo just recode
> things as needed?

We do, but it is relevant to be able to do something special with

> I guess that if @documentencoding were made to be irrelevant here, then above
> you would not need to say
>   "for example in pt_BR.us-ascii.po the @-commands for accents need to be
>   used."
> since there would be only one pt_BR.po, and translations would pick UTF-8 for
> its encoding, like they do for so many other translation domains.

The idea here is to give an option for translators, which makes the
encoding relevant.  But there is no obligation.  And it is possible to
use a file like fr.po and still use the @-commands for accented letters.
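
To make the two spellings concrete, here is a small Python sketch that maps
a few Texinfo accent @-commands to the corresponding UTF-8 letters. It is an
illustration only, not what texi2any does internally: the function name is
hypothetical, only six accents are covered, and the sample strings are made
up rather than taken from a real PO file.

```python
import re
import unicodedata

# Texinfo accent commands mapped to Unicode combining marks.
# Only a handful of accents are covered in this sketch.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
    ",": "\u0327",  # cedilla
}

# Braced form (@'{e}, @,{c}) or short form (@'e); the cedilla command
# is only accepted in its braced form here.
ACCENT_RE = re.compile(
    r"""@(?:([,'`^"~])\{([A-Za-z])\}|(['`^"~])([A-Za-z]))"""
)


def accents_to_utf8(text: str) -> str:
    """Replace the supported accent @-commands with precomposed letters."""
    def repl(match: re.Match) -> str:
        accent = match.group(1) or match.group(3)
        letter = match.group(2) or match.group(4)
        # NFC composes letter + combining mark into a single code point.
        return unicodedata.normalize("NFC", letter + COMBINING[accent])
    return ACCENT_RE.sub(repl, text)


if __name__ == "__main__":
    # Made-up sample strings, for illustration only.
    print(accents_to_utf8("Sum@'ario"))
    print(accents_to_utf8("Fran@,{c}ais"))
```

Either spelling can appear in a UTF-8 PO file; in a us-ascii one, only the
@-command spelling is available.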

