bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-s

bug-gnu-emacs

From:	Alex Bochannek
Subject:	bug#43351: 27.1; [PATCH] Change ASCII handling in mm-charset-to-coding-system to us-ascii
Date:	Sat, 12 Sep 2020 00:04:15 -0700
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/27.1 (darwin)

Hello!

This is a very small patch, but I am not confident that there aren't
other side effects, so please evaluate it carefully.

In the fix for bug#5458 (2011-06-30), a change was made to
mm-charset-to-coding-system to support "ansi.x3.4*" as an alias for
'ascii. As part of that patch 'us-ascii was also mapped to 'ascii. This
is problematic because decode-coding-string does not recognize 'ascii as
a coding system and signals an "Invalid coding system: ascii" error.
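
For illustration, something like the following can be evaluated to check
this. It is only a sketch: the function name my/ascii-coding-check and the
sample string are made up for this example, and the results noted in the
comments are what this report describes, not something guaranteed on every
Emacs version.

```elisp
;; Hypothetical helper, not part of Emacs or Gnus.
(defun my/ascii-coding-check ()
  "Return a list describing how 'ascii and 'us-ascii behave as coding systems."
  (list
   ;; Per this report, 'ascii is not a coding system (expected nil).
   (coding-system-p 'ascii)
   ;; 'us-ascii is expected to be a valid coding system (non-nil).
   (coding-system-p 'us-ascii)
   ;; Decoding with 'ascii should signal an error; capture its message
   ;; instead of letting it propagate.  "hello" is an arbitrary sample.
   (condition-case err
       (decode-coding-string "hello" 'ascii)
     (error (error-message-string err)))))

(my/ascii-coding-check)
```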

As a result, when using gnus-article-browse-html-article (K H) to
display a text/html message that has charset=us-ascii (or presumably
also charset=ascii), the display will fail iff the header of the message
is not ASCII.

Tracing gnus-article-browse-html-parts, the call chain in my test case
looks like this:

(setq hcharset (mm-find-mime-charset-region (point-min) (point-max)))
returns 'utf-8 because of the RFC 2047 encoded words in the From
header. The HTML part has charset=us-ascii, so the two charsets
differ. (setq body (mm-charset-to-coding-system charset nil t))
then maps 'us-ascii to 'ascii (see above), and the attempt to transcode
the part into 'utf-8 fails at
(encode-coding-string (decode-coding-string content body) charset).
That last piece of code seems to have gone in on 2016-02-12 when
removing XEmacs compat functions from mm-util.el.
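
The transcoding step can be sketched in isolation roughly as follows. The
function my/transcode-part and the sample HTML string are hypothetical
stand-ins for the local variables in gnus-article-browse-html-parts; this
is not the actual Gnus code, only the same decode/encode pair.

```elisp
;; Hypothetical stand-in for the transcoding step; not the Gnus source.
(defun my/transcode-part (content body charset)
  "Decode CONTENT using coding system BODY, then re-encode it as CHARSET."
  (encode-coding-string (decode-coding-string content body) charset))

;; With BODY bound to 'us-ascii the call is expected to succeed.
(my/transcode-part "<html><body>plain text</body></html>" 'us-ascii 'utf-8)

;; With BODY bound to 'ascii, as currently returned for charset=us-ascii,
;; decode-coding-string is where the error described above is signaled.
(condition-case err
    (my/transcode-part "<html><body>plain text</body></html>" 'ascii 'utf-8)
  (error (error-message-string err)))
```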

This patch no longer maps 'us-ascii and instead maps 'ascii to
'us-ascii. (The ANSI alias match is untouched, but it now yields
'us-ascii as well.) Alternatively, I could modify
gnus-article-browse-html-parts to special-case this, but I don't think
mm-charset-to-coding-system should output 'ascii if it is not a valid
coding system (anymore?). However, I don't know what else that could
possibly break, which is why I want to offer this patch with some
caution.

Please let me know if there is anything I can do to help with getting
this change accepted.

Thanks!

-- 
Alex. <abochannek@google.com>

diff --git a/lisp/gnus/mm-util.el b/lisp/gnus/mm-util.el
index 282465722d..3dc93e4ad4 100644
--- a/lisp/gnus/mm-util.el
+++ b/lisp/gnus/mm-util.el
@@ -137,9 +137,9 @@ mm-charset-to-coding-system
         (let ((cs (cdr (assq charset mm-charset-override-alist))))
           (and cs (mm-coding-system-p cs) cs))))
    ;; ascii
-   ((or (eq charset 'us-ascii)
+   ((or (eq charset 'ascii)
        (string-match "ansi.x3.4" (symbol-name charset)))
-    'ascii)
+    'us-ascii)
    ;; Check to see whether we can handle this charset.  (This depends
    ;; on there being some coding system matching each `mime-charset'
    ;; property defined, as there should be.)
