bug-gnu-emacs

bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-s


From: Alex Bochannek
Subject: bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-system to us-ascii
Date: Sat, 12 Sep 2020 00:04:15 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (darwin)

Hello!

This is a very small patch, but I am not confident that there aren't
other side effects, so please evaluate it carefully.

In the fix for bug#5458 (2011-06-30), a change was made to
mm-charset-to-coding-system to support "ansi.x3.4*" as an alias for
'ascii. As part of that patch 'us-ascii was also mapped to 'ascii. This
is problematic because decode-coding-string does not recognize 'ascii as
a coding system and signals an "Invalid coding system: ascii" error.
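
The behavior described above can be checked with a short sketch like the
following. It assumes a stock Emacs in which nobody has defined a coding
system alias named `ascii'; the sample string is arbitrary.

```elisp
;; Sketch: compare how Emacs treats the two names.
;; Assumption: no user-defined coding system alias called `ascii' exists.
(coding-system-p 'us-ascii)   ; a valid coding system name
(coding-system-p 'ascii)      ; not a coding system name

;; Decoding with the unknown name signals `coding-system-error'.
;; The handler turns the error into its message string.
(defvar my/ascii-decode-result
  (condition-case err
      (decode-coding-string "hello" 'ascii)
    (coding-system-error (error-message-string err))))
```

Evaluating `my/ascii-decode-result' afterwards should show the error text
quoted above rather than the decoded string.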

As a result, when using gnus-article-browse-html-article (K H) to
display a text/html message that has charset=us-ascii (or presumably
also charset=ascii), the display will fail iff the header of the message
is not ASCII.

Tracing gnus-article-browse-html-parts, the call chain in my test case
looks like this:

(setq hcharset (mm-find-mime-charset-region (point-min)(point-max)))
returns 'utf-8 because of the RFC 2047 encoded words in the From
header. The HTML part has charset=us-ascii, and therefore coding and
charset differ. (setq body (mm-charset-to-coding-system charset nil t))
then maps 'us-ascii to 'ascii (see above), and the attempt to transcode
the part into 'utf-8 fails at (encode-coding-string
(decode-coding-string content body) charset). That last piece of code
seems to have gone in on 2016-02-12 when removing XEmacs compat
functions from mm-util.el.
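
A reduced sketch of the last two steps of that chain, outside of Gnus, might
look like this. The variable names follow the description above, the
content string is made up, and the surrounding logic of
gnus-article-browse-html-parts is omitted, so this is only an
approximation of the real code path.

```elisp
;; Sketch only: isolates the charset lookup and the transcoding step.
;; `content' is an invented sample; the real function takes it from the part.
(require 'mm-util)

(defvar my/transcode-result
  (let* ((charset 'us-ascii)
         (content "<html><body>hi</body></html>")
         ;; Without the patch this is expected to yield `ascii'.
         (body (mm-charset-to-coding-system charset nil t)))
    (condition-case err
        (encode-coding-string (decode-coding-string content body) charset)
      ;; Return the error message instead of aborting.
      (coding-system-error (error-message-string err)))))
```

On an unpatched mm-util.el, `my/transcode-result' should hold the
"Invalid coding system" message; with the patch applied it should hold
the unchanged content string.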

This patch no longer maps 'us-ascii and instead maps 'ascii to
'us-ascii. (The ANSI alias is untouched.) Alternatively, I could modify
gnus-article-browse-html-parts to special-case this, but I don't think
mm-charset-to-coding-system should output 'ascii if it is not a valid
coding system (anymore?). However, I don't know what else that could
possibly break, which is why I want to offer this patch with some
caution.
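
To make the intended mapping concrete, here is a standalone sketch of just
the patched clause. The function name `my/ascii-clause' is hypothetical
and not part of mm-util.el; the real function has many more clauses,
which are left out here.

```elisp
;; Hypothetical helper mirroring only the patched "ascii" clause.
(defun my/ascii-clause (charset)
  "Return `us-ascii' if CHARSET is `ascii' or an ANSI X3.4 alias, else nil.
A nil result stands for \"fall through to the later clauses\"."
  (when (or (eq charset 'ascii)
            (string-match "ansi.x3.4" (symbol-name charset)))
    'us-ascii))

;; Try the clause on a few charset names.
(defvar my/ascii-clause-results
  (mapcar (lambda (cs) (cons cs (my/ascii-clause cs)))
          (list 'ascii (intern "ansi_x3.4-1968") 'us-ascii 'utf-8)))
```

In this sketch 'ascii and the ANSI alias both map to 'us-ascii, while
'us-ascii and 'utf-8 yield nil, meaning they would be handled by the
later clauses of the real function.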

Please let me know if there is anything I can do to help with getting
this change accepted.

Thanks!

-- 
Alex. <abochannek@google.com>
diff --git a/lisp/gnus/mm-util.el b/lisp/gnus/mm-util.el
index 282465722d..3dc93e4ad4 100644
--- a/lisp/gnus/mm-util.el
+++ b/lisp/gnus/mm-util.el
@@ -137,9 +137,9 @@ mm-charset-to-coding-system
         (let ((cs (cdr (assq charset mm-charset-override-alist))))
           (and cs (mm-coding-system-p cs) cs))))
    ;; ascii
-   ((or (eq charset 'us-ascii)
+   ((or (eq charset 'ascii)
        (string-match "ansi.x3.4" (symbol-name charset)))
-    'ascii)
+    'us-ascii)
    ;; Check to see whether we can handle this charset.  (This depends
    ;; on there being some coding system matching each `mime-charset'
    ;; property defined, as there should be.)
