bug-texinfo

Re: [PATCHes] Add basic multibyte charset handling to makeinfo


From: Eli Zaretskii
Subject: Re: [PATCHes] Add basic multibyte charset handling to makeinfo
Date: Wed, 06 Dec 2006 06:28:12 +0200

> Date: Tue, 05 Dec 2006 12:51:29 +0100
> From: Miloslav Trmac <address@hidden>
> CC: Karl Berry <address@hidden>, address@hidden
> 
> - character set names are not portable across operating systems

Sorry, I don't think I understand.  Can you provide a few examples of
such non-portability?

> - even if you know that "iso-8859-1" is an acceptable character set
>   name, that doesn't mean a locale using that character set exists.
>   $current_locale.iso-8859-1 most likely doesn't exist.

It should not be a problem to maintain a database of valid locales.
In fact, I wouldn't be surprised if such a database already exists in
gettext or libiconv, so that all we need to do is query it for a
locale that supports a given encoding.
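
A minimal sketch of such a query, written in C against the standard
setlocale () and the POSIX nl_langinfo (CODESET) rather than any
gettext or libiconv interface (none is named above).  The function
names and the caller-supplied candidate list are hypothetical.  The
name comparison ignores case and punctuation, because the same
encoding may be spelled "iso-8859-1" on one system and "ISO8859-1" on
another.

```c
#include <ctype.h>
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Compare two codeset names, ignoring case and punctuation, so that
   "iso-8859-1" matches "ISO8859-1".  Returns 1 on a match, else 0.  */
static int
same_codeset (const char *a, const char *b)
{
  for (;;)
    {
      while (*a && !isalnum ((unsigned char) *a))
        a++;
      while (*b && !isalnum ((unsigned char) *b))
        b++;
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      if (*a == '\0')
        return 1;
      a++;
      b++;
    }
}

/* Hypothetical helper: return the first name in CANDIDATES (a
   NULL-terminated array) whose LC_CTYPE codeset matches ENCODING, or
   NULL if none does.  The caller's LC_CTYPE locale is restored.  */
const char *
find_locale_for_encoding (const char *encoding,
                          const char *const *candidates)
{
  const char *found = NULL;
  const char *cur = setlocale (LC_CTYPE, NULL);
  char *saved;
  size_t i;

  if (cur == NULL || (saved = malloc (strlen (cur) + 1)) == NULL)
    return NULL;
  strcpy (saved, cur);  /* setlocale may overwrite its own result */

  for (i = 0; candidates[i] != NULL && found == NULL; i++)
    if (setlocale (LC_CTYPE, candidates[i]) != NULL
        && same_codeset (nl_langinfo (CODESET), encoding))
      found = candidates[i];

  setlocale (LC_CTYPE, saved);
  free (saved);
  return found;
}
```

Where the candidate names come from (the output of "locale -a", a
table shipped with the program, or something else) is left open here.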

Finally, we could even redesign @documentencoding to require a valid
locale name, not just an encoding name, if that would make a
difference.

> So, if we want @documentencoding, we can't use system locales, and we
> need a replacement that does at minimum the equivalents of mbtowc () and
> wcwidth ().  It is completely unreasonable to implement this directly
> inside texinfo sources, and I don't think it is really practical to make
> texinfo dependent on some other library that provides this functionality
> (ICU, maybe?).

I think, given the above and what Karl said, you could simply switch
locales dynamically, to support both the @documentencoding locale and
the current locale for diagnostic messages from makeinfo.
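
A minimal sketch of that kind of switching, in standard C.  It is not
taken from the makeinfo sources; the helper names are hypothetical.
Only LC_CTYPE is switched, so that mbtowc () decodes the document text
in the document's encoding, while LC_MESSAGES is left alone.  The
example counts characters only; a real implementation would also need
something like wcwidth () on each decoded character to get column
widths.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: switch LC_CTYPE to DOC_LOCALE.  Returns a
   malloc'd copy of the previous locale name, or NULL on failure (in
   which case the locale is unchanged).  */
static char *
enter_document_locale (const char *doc_locale)
{
  const char *cur = setlocale (LC_CTYPE, NULL);
  char *saved;

  if (cur == NULL || (saved = malloc (strlen (cur) + 1)) == NULL)
    return NULL;
  strcpy (saved, cur);  /* setlocale may overwrite its own result */

  if (setlocale (LC_CTYPE, doc_locale) == NULL)
    {
      free (saved);
      return NULL;
    }
  return saved;
}

/* Hypothetical helper: restore the locale saved by
   enter_document_locale and release the saved name.  */
static void
leave_document_locale (char *saved)
{
  assert (saved != NULL);
  setlocale (LC_CTYPE, saved);
  free (saved);
}

/* Count the characters (not bytes) in TEXT as decoded under
   DOC_LOCALE.  Returns -1 if the locale is unavailable or TEXT is not
   valid in that locale's encoding.  */
long
count_chars_in_locale (const char *text, const char *doc_locale)
{
  char *saved = enter_document_locale (doc_locale);
  size_t left = strlen (text);
  long count = 0;

  if (saved == NULL)
    return -1;

  mbtowc (NULL, NULL, 0);  /* reset any shift state */
  while (left > 0)
    {
      wchar_t wc;
      int n = mbtowc (&wc, text, left);

      if (n <= 0)
        {
          count = -1;  /* invalid or incomplete sequence */
          break;
        }
      text += n;
      left -= (size_t) n;
      count++;
    }

  leave_document_locale (saved);
  return count;
}
```

Diagnostics printed between the enter and leave calls would still be
selected by LC_MESSAGES; whether their output encoding is affected by
the temporary LC_CTYPE is something a real patch would have to check.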

> The standalone info reader ignores the "Local Variables: coding: ..."
> trailer anyway, so the assumption that info files use the system's
> character set is already present, although makeinfo doesn't currently
> use it.

As Karl points out, not all Info readers ignore that.  Adding this
support to the stand-alone reader was not a priority as long as
@documentencoding was not really doing what we advertise; with the
acceptance of your changes, I hope Someone(tm) will find time and
resources to do that.  But in any case, missing features in the Info
reader should not be a reason to prevent makeinfo from doing TRT.

> The UNIX world basically assumes a single system-wide character set (a
> single character set must be used for the names in the filesystem, at
> least);  while technically possible, adding character set indication to
> every text file format and character set conversion to every program
> using the file format is not practical: it is too much work, it adds
> confusing failure modes and it breaks the traditional text manipulation
> tool usage.

We are not talking about such a large change (although Emacs and the
modern Unicode-based editors already allow you to manipulate
multilanguage texts).  We are talking about the ability to produce
a manual in French in a non-French locale.  I might then send such a
manual to someone who lives in a French locale, for example.

More generally, I'm deeply disturbed to hear, in the year 2006,
arguments that in effect say that m17n and i18n are not needed, since
l10n is good enough.  That was the idea 10 years ago; a lot of water
has gone under the bridge since then, and I thought we had moved on...

Thanks.



