From: Alexandre Duret-Lutz
Subject: bug#44307: 27.1; UTF-8 parts transferred as 8bit in multipart messages fail to decode
Date: Thu, 07 Jan 2021 17:06:44 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Lars Ingebrigtsen <larsi@gnus.org> writes:
> I've now committed a fix to mm-with-part that may or may not fix this
> nnmaildir problem.
Question: shouldn't mm-with-part always leave the buffer in unibyte
mode? The comment at the beginning of the macro seems to suggest that,
but the new "if" does not call (mm-disable-multibyte) after inserting
the part.
Otherwise that would just push the issue further away, to the next
place where the contents of the mm-with-part buffer are inserted into a
unibyte buffer.
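To make the unibyte expectation concrete, here is a minimal sketch using only core Emacs primitives. It is not the Gnus code path: `my/unibyte-roundtrip-demo` is a hypothetical name, and `set-buffer-multibyte` stands in for what I assume `mm-disable-multibyte` does. It only shows that a unibyte buffer keeps the raw UTF-8 bytes intact, so a later decode step sees exactly what was transferred.

```elisp
(defun my/unibyte-roundtrip-demo ()
  "Hypothetical helper: hold UTF-8 bytes in a unibyte buffer, then decode them.
Returns (MULTIBYTE-FLAG STRING-IS-MULTIBYTE BYTE-COUNT DECODED-TEXT)."
  (with-temp-buffer
    ;; Assumption: this is the effect mm-disable-multibyte is meant to have.
    (set-buffer-multibyte nil)
    ;; U+00E9 encoded as UTF-8 is the two raw bytes #xC3 #xA9.
    (insert (encode-coding-string "\u00e9" 'utf-8))
    (let ((bytes (buffer-string)))
      (list enable-multibyte-characters      ; buffer flag after the call
            (multibyte-string-p bytes)       ; the extracted string is unibyte
            (length bytes)                   ; one element per raw byte
            (decode-coding-string bytes 'utf-8)))))
```

Evaluating `(my/unibyte-roundtrip-demo)` should show the flag unset, a two-byte unibyte string, and the single decoded character.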
> Can you try this (in Emacs 28)? You may have to do a "make bootstrap"
> or at least remove all the lisp/gnus/*.elc files for the change to
> have any effect.
After "make bootstrap", this seems to fix only the rendering of
text/html utf-8 parts (I'm using w3m, if that matters). However,
text/plain utf-8 parts are still garbled as they were before.
If I tweak the patch as follows:
--- a/lisp/gnus/mm-decode.el
+++ b/lisp/gnus/mm-decode.el
@@ -1271,7 +1271,9 @@ mm-with-part
 ;; multibyte buffer here, but if it's using an 8bit
 ;; Content-Transfer-Encoding, then work around that by
 ;; just ignoring the situation.
- (insert-buffer-substring (mm-handle-buffer handle))
+ (progn
+ (insert-buffer-substring (mm-handle-buffer handle))
+ (mm-disable-multibyte))
 ;; Do the decoding.
 (mm-disable-multibyte)
 (insert-buffer-substring (mm-handle-buffer handle))
this seems to fix text/plain utf-8 parts as well; however, the
rendering of windows-1252 parts is now broken...
See the following table, where "with patch" refers to
commit (23a887e4), and "disable-mb" to the above tweak.
|--------------+------------+---------------+------------+------------|
| charset      | type       | without patch | with patch | disable-mb |
|--------------+------------+---------------+------------+------------|
| utf-8        | text/html  | garbled       | ok         | ok         |
| windows-1252 | text/html  | ok            | ok         | garbled    |
| utf-8        | text/plain | garbled       | garbled    | ok         |
| windows-1252 | text/plain | ok            | ok         | garbled    |
|--------------+------------+---------------+------------+------------|
When looking at windows-1252-encoded mails read by nnmaildir and
rendered using "C-u g" (where none of the above changes should matter),
it's obvious that the buffer contains utf-8 characters.
My guess is that when nnmaildir calls nnheader-insert-file-contents to
read the mail, it does so with the 'undecided coding system. Emacs then
automatically detects windows-1252 and converts it to utf-8 for its
internal representation.
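If that guess is right, the garbling in the "disable-mb" column would be a plain double decode. The sketch below reproduces that effect with core coding-system primitives only. It is an illustration under assumptions, not the nnmaildir or mm-decode code: `my/double-decode-demo` is a hypothetical name, and encoding to 'utf-8 stands in for exposing the internal representation when the buffer is made unibyte.

```elisp
(defun my/double-decode-demo ()
  "Hypothetical helper: show how an already-decoded windows-1252 part garbles.
Returns (TEXT-AFTER-FIRST-DECODE TEXT-AFTER-SECOND-DECODE)."
  (let* (;; "café" as it would sit on disk in windows-1252: é is the byte #xE9.
         (on-disk (unibyte-string ?c ?a ?f #xe9))
         ;; Assumed step 1: reading the file auto-detects the charset and
         ;; decodes the bytes into characters.
         (chars (decode-coding-string on-disk 'windows-1252))
         ;; Assumed step 2: going unibyte exposes é as the UTF-8 bytes #xC3 #xA9.
         (internal (encode-coding-string chars 'utf-8))
         ;; Step 3: decoding those bytes again with the declared charset
         ;; turns each byte into a separate windows-1252 character.
         (shown (decode-coding-string internal 'windows-1252)))
    (list chars shown)))
```

Evaluating `(my/double-decode-demo)` should return the correct text first and the garbled "cafÃ©" second, which matches the kind of damage seen on windows-1252 parts once the extra (mm-disable-multibyte) is in place.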
--
Alexandre Duret-Lutz