--- Begin Message ---
Subject: Re: charset=macintosh
Date: Sun, 09 Mar 2003 04:56:44 +0100
User-agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 (gnu/linux)

Simon Josefsson <jas@extundo.com> writes:
> But what if you are saying about UTF-8 clients being MIME capable is
> true, and since UTF-8 is typically never preferred by current emacsen,
> doesn't emacs' current guessing works the best we can hope for?
> Doesn't it detect among ISO-8859-X, ISO-2022 and Big5 properly?
No. I was hoping we could do something like this (for headers):
    (let ((coding-systems (detect-coding-string string)))
      (if (memq default coding-systems)
          (decode-coding-string string default)
        (decode-coding-string string (car coding-systems))))
i.e. if the default coding system is valid for the string, then use
that; otherwise use whatever Emacs thinks is the most likely coding
system. I think this would be ideal.
But unfortunately `detect-coding-string' _doesn't_ return a complete
list of possible coding systems. Consider this scenario:
I'm using Emacs in a Latin-1 locale. dk.* newsgroups work fine
because latin-1 is the default. But I also subscribe to, say, a few
Korean newsgroups. The entry in `gnus-groups-charset-alist':
    ("\\(^\\|:\\)han\\>" euc-kr)
should take care of selecting the proper default charset. But *oops*,
`detect-coding-string' doesn't think that euc-kr is a possible charset
for a Korean string encoded in euc-kr:
    (detect-coding-string (encode-coding-string "안녕" 'euc-kr))
    => (iso-latin-1 iso-latin-1 raw-text japanese-shift-jis
        chinese-big5 no-conversion)
So the above approach would fail.
> 2) Users with emacs in UTF-8 prefers UTF-8 too often, even when the
> data is invalid UTF-8 and another encoding should be selected.
>
> The second situation is a bug, and I hope we can fix this.
Yep, 2) is the most serious problem, especially because more and more
people are (often unknowingly) using a UTF-8 locale because Red Hat 8
switched to UTF-8 by default. Those people would experience Gnus as
broken when reading hierarchies like dk.* or de.*.
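To make 2) concrete, here is a rough sketch of a validity check that could
run before UTF-8 is chosen. It is only an illustration, not anything Gnus
does: the `my-' function names are made up, and the test relies on the
assumption that invalid bytes survive decoding as characters of the
`eight-bit' charset, which I believe holds for Emacs 23 and later but
not for older versions.

```elisp
;; Hypothetical helper, not part of Gnus or Emacs.
;; Assumption: undecodable bytes come out of `decode-coding-string'
;; as characters of the `eight-bit' charset (Emacs 23 and later).
(defun my-valid-utf-8-p (bytes)
  "Return non-nil if the unibyte string BYTES decodes cleanly as UTF-8."
  (not (memq 'eight-bit
             (find-charset-string
              (decode-coding-string bytes 'utf-8)))))

;; Hypothetical helper: use UTF-8 only when BYTES really is valid
;; UTF-8; otherwise decode with FALLBACK, e.g. the group's default
;; charset taken from `gnus-group-charset-alist'.
(defun my-decode-utf-8-or (bytes fallback)
  "Decode BYTES as UTF-8 if valid, else with coding system FALLBACK."
  (if (my-valid-utf-8-p bytes)
      (decode-coding-string bytes 'utf-8)
    (decode-coding-string bytes fallback)))

;; Usage: a Latin-1 encoded Danish word is not valid UTF-8, so the
;; fallback coding system is used instead.
(my-decode-utf-8-or "bl\345b\346r" 'iso-latin-1)
```

Note that this only helps in the UTF-8 direction: almost any byte string
is "valid" Latin-1, so the same trick cannot tell latin-1 from euc-kr.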
--- End Message ---