bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-s
From: Alex Bochannek
Subject: bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-system to us-ascii
Date: Sat, 12 Sep 2020 00:04:15 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (darwin)
Hello!
This is a very small patch, but I am not confident that there aren't
other side effects, so please evaluate it carefully.
In the fix for bug#5458 (2011-06-30), a change was made to
mm-charset-to-coding-system to support "ansi.x3.4*" as an alias for
'ascii. As part of that patch 'us-ascii was also mapped to 'ascii. This
is problematic because decode-coding-string does not recognize 'ascii as
a coding system and throws an "Invalid coding system: ascii" exception.
As a result, when using gnus-article-browse-html-article (K H) to
display a text/html message that has charset=us-ascii (or presumably
also charset=ascii), the display will fail iff the header of the message
is not ASCII.
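The error described above can be probed in isolation. This is a minimal
sketch, not part of the patch; my-try-decode is a hypothetical helper
name. Per this report, on Emacs 27.1 the 'ascii call should come back as
an error while 'us-ascii decodes normally.

```elisp
;; Hypothetical helper (not in Emacs): attempt a decode and report the
;; outcome as data instead of signalling.
(defun my-try-decode (string coding-system)
  "Return (ok . DECODED) or (error . MESSAGE) for decoding STRING."
  (condition-case err
      (cons 'ok (decode-coding-string string coding-system))
    (error (cons 'error (error-message-string err)))))

(my-try-decode "hello" 'us-ascii) ; a valid coding system
(my-try-decode "hello" 'ascii)    ; per the report: "Invalid coding system: ascii"
```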
Tracing gnus-article-browse-html-parts, the call chain in my test case
looks like this:
(setq hcharset (mm-find-mime-charset-region (point-min) (point-max)))
returns 'utf-8 because of the RFC 2047 encoded words in the
From header. The HTML part has charset=us-ascii and therefore coding and
charset differ. (setq body (mm-charset-to-coding-system charset nil t))
then maps 'us-ascii to 'ascii (see above) and the attempt to transcode
the part into 'utf-8 fails at (encode-coding-string
(decode-coding-string content body) charset). That last piece of code
seems to have gone in on 2016-02-12 when removing XEmacs compat
functions from mm-util.el.
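The transcoding step can be sketched outside Gnus as follows. This is an
approximation under stated assumptions, not the actual body of
gnus-article-browse-html-parts: my-transcode-probe, the placeholder HTML
string, and the hard-coded charsets are all made up to stand in for the
test case. It returns what the mapping produced, whether that is a valid
coding system, and the result (or error message) of the transcode.

```elisp
(require 'mm-util)

;; Hypothetical probe (not part of Gnus).  PART-CHARSET stands in for
;; the charset= parameter of the HTML part, TARGET for the charset
;; found by scanning the headers.
(defun my-transcode-probe (content part-charset target)
  "Try to transcode CONTENT from PART-CHARSET to TARGET."
  (let ((body (mm-charset-to-coding-system part-charset nil t)))
    (list :body body
          :valid (coding-system-p body)
          :result (condition-case err
                      (encode-coding-string
                       (decode-coding-string content body) target)
                    (error (error-message-string err))))))

;; Placeholder input; the real message is not reproduced here.
(my-transcode-probe "<html><body>hi</body></html>" 'us-ascii 'utf-8)
```

Without the patch, :body is expected to be 'ascii and :result an error
message; with it, :body should be 'us-ascii and :result the encoded
string.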
This patch no longer maps 'us-ascii and instead maps 'ascii to 'us-ascii.
(The ANSI alias is untouched.) Alternatively, I could modify
gnus-article-browse-html-parts to special-case this, but I don't think
mm-charset-to-coding-system should output 'ascii if it is not a valid
coding system (anymore?). However, I don't know what else that could
possibly break, which is why I want to offer this patch with some
caution.
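For comparison, the caller-side alternative mentioned above might look
roughly like this. It is only a sketch of the idea, not code from Gnus
and not what the patch does; my-body-coding-system is a hypothetical
name.

```elisp
(require 'mm-util)

;; Hypothetical wrapper illustrating the special-case approach.
(defun my-body-coding-system (charset)
  "Map CHARSET to a coding system, substituting `us-ascii' for `ascii'.
Return nil if the mapped value is not a usable coding system."
  (let ((cs (mm-charset-to-coding-system charset nil t)))
    (cond ((coding-system-p cs) cs)          ; already usable (or nil)
          ((eq cs 'ascii) 'us-ascii)         ; the special case
          (t nil))))

(my-body-coding-system 'us-ascii)
```

The drawback is that every other caller of mm-charset-to-coding-system
would still receive 'ascii, which is why the patch changes the mapping
itself.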
Please let me know if there is anything I can do to help with getting
this change accepted.
Thanks!
--
Alex. <abochannek@google.com>
diff --git a/lisp/gnus/mm-util.el b/lisp/gnus/mm-util.el
index 282465722d..3dc93e4ad4 100644
--- a/lisp/gnus/mm-util.el
+++ b/lisp/gnus/mm-util.el
@@ -137,9 +137,9 @@ mm-charset-to-coding-system
          (let ((cs (cdr (assq charset mm-charset-override-alist))))
            (and cs (mm-coding-system-p cs) cs))))
    ;; ascii
-   ((or (eq charset 'us-ascii)
+   ((or (eq charset 'ascii)
         (string-match "ansi.x3.4" (symbol-name charset)))
-    'ascii)
+    'us-ascii)
    ;; Check to see whether we can handle this charset.  (This depends
    ;; on there being some coding system matching each `mime-charset'
    ;; property defined, as there should be.)