Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF
From: Eli Zaretskii
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sun, 27 Sep 2015 11:55:36 +0300
> Cc: address@hidden, address@hidden, address@hidden
> From: Paul Eggert <address@hidden>
> Date: Sun, 27 Sep 2015 01:22:48 -0700
>
> Eli Zaretskii wrote:
> > I've also looked at the *.po files in the latest releases of GNU Make,
> > Gawk, Texinfo, and Binutils, and I find that between 20% and 25% of
> > such files still use non-UTF-8 encodings.
>
> Yes, and those files are a pain to look at with Emacs now, since it typically
> misguesses their encodings. Presumably Emacs should be looking at .po files'
> charset= decorations.
You need to install po-mode for that.
But anyway, that's not the issue at hand. I just used those files as
indicators of preferences of some locales.
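To make the charset= point concrete, here is a minimal Python sketch of reading the charset that a .po file declares in its header entry before decoding it. This is an illustration only, not how po-mode or Emacs does it; the function names `po_declared_charset` and `decode_po`, the 4096-byte search window, and the UTF-8 fallback are all assumptions made for the example.

```python
import re

# Matches the charset parameter of the Content-Type line that gettext
# .po files carry in their header entry (hypothetical helper pattern).
_CHARSET_RE = re.compile(
    rb"Content-Type:[^\n]*?charset=([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def po_declared_charset(raw: bytes):
    """Return the charset named in a .po header, or None if there is none."""
    # The header entry sits at the top of the file, so only the
    # beginning is searched (4096 bytes is an arbitrary choice).
    match = _CHARSET_RE.search(raw[:4096])
    if match is None:
        return None
    name = match.group(1).decode("ascii")
    # "CHARSET" is the unfilled placeholder found in .pot templates.
    return None if name.upper() == "CHARSET" else name


def decode_po(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode with the declared charset, else with the assumed fallback."""
    return raw.decode(po_declared_charset(raw) or fallback)
```

A charset name that Python does not know would raise LookupError in `decode_po`; a real implementation would have to handle that case.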
> > while I agree with you that UTF-8 encoded files are the majority
> > among non-ASCII files (and Emacs development aligns itself with that
> > fact very well), the non-UTF-8 minority, even in the Posix world, is
> > still significant enough, and we cannot possibly ignore it.
>
> Naturally we cannot ignore it. All I'm suggesting is that we change the default
> behavior so that it's more UTF-8 friendly, since that's the way the world is
> going. The old Emacs behavior should still be available, for people who need it.
You use "default" here in a sense that is different from what the Mule
stuff does. Since Emacs attempts to support i18n, not just l10n, it
cannot ask users to modify their defaults whenever they meet a file
that's decoded incorrectly. Emacs uses the defaults in this area as
the last resort, when no other information is available in the file
itself or its accompanying meta-data. That default is already as
friendly to UTF-8 as possible: UTF-8 is used in any locale where
that's the default. Going further, i.e. preferring UTF-8 in locales
whose preferences are different, will simply bring back the old bugs
and misfeatures of Emacs 20 and 21 which we worked so hard to
eradicate.
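The order of precedence described above can be sketched roughly as follows. This is a simplified Python illustration, not the actual Emacs logic; `pick_coding_system` and its three parameters are hypothetical names, and real decoding involves more sources of information than these.

```python
import locale


def pick_coding_system(declared=None, detected=None, locale_default=None):
    """Pick an encoding name; the locale default is consulted last."""
    if declared:
        # Information carried by the file or its meta-data,
        # e.g. a coding cookie or a charset= header.
        return declared
    if detected:
        # The result of analysing the bytes themselves.
        return detected
    # Last resort: whatever the user's locale prefers. In a UTF-8
    # locale this is already UTF-8.
    return locale_default or locale.getpreferredencoding(False)
```

The point of the sketch is that changing the last line affects only the files for which nothing better is known, which is why improving the first two steps matters more.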
IMO, the _only_ sane way forward is to introduce more reliable ways of
detecting the encoding, whether by using some new kinds of meta-data
or by more extensive analysis of the text itself. (The latter
solution will probably have difficulties with decoding sub-process
output, but it could be very efficient with disk files and large
bodies of text made available to Emacs at once.)
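As a rough illustration of text analysis, and of why sub-process output is harder, here is a Python sketch that checks whether bytes form valid UTF-8. It is not the Emacs detect_coding_* code; both function names are made up for the example. A successful strict decode of non-ASCII data is strong evidence of UTF-8, though not proof.

```python
import codecs


def looks_like_utf8(data: bytes):
    """Return True, False, or None (pure ASCII, so no evidence either way)."""
    if data.isascii():
        return None
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def chunks_look_like_utf8(chunks):
    """Validity check for data that arrives piecemeal, e.g. sub-process output.

    Returns True for any stream that is valid UTF-8, including pure ASCII.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        for chunk in chunks:
            # An incomplete trailing sequence is buffered, not rejected.
            decoder.decode(chunk)
        # Raises if the stream ended in the middle of a character.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True
```

With a whole file in hand, `looks_like_utf8` sees everything at once. With piecemeal output, a chunk boundary can split a multibyte sequence, so judging each chunk in isolation would reject valid text; the incremental version avoids that, but it can give its answer only after the data has been read.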
IOW, I don't think we will be able to change our locale-derived
defaults any time soon. What we can do is minimize the probability of
having to fall back on those defaults. But this requires that
Someone™ volunteers to revamp our detect_coding_* implementations in
that direction.