bug-gnu-emacs


bug#38587: base64-decode-region breaks encoding


From: Juri Linkov
Subject: bug#38587: base64-decode-region breaks encoding
Date: Mon, 16 Dec 2019 23:51:48 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)

>> But is it still possible to tell base64-decode-region
>> about the expected output coding system?  Maybe using
>> a prefix arg: C-u M-x base64-decode-region could ask
>> for a coding, defaulting to the buffer's coding.
>
> If we want to make such a change, then "C-x RET c" is a better prefix
> command, as it is consistent with other commands that accept
> coding-system overrides.
>
>> Is there an equivalent of force_encoding('UTF-8') in Emacs?
>
> "C-x RET c utf-8 RET M-x SOME-COMMAND RET"

I see that 'C-x RET c' just binds coding-system-for-read and
coding-system-for-write for the next command, so could
base64-decode-region take the coding system from these variables?
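
To illustrate the idea, here is a minimal sketch of a function that
honors such an override.  The name my-decode-base64-string is
hypothetical, and the let-binding only simulates what 'C-x RET c'
does for the next command; nothing like this exists in Emacs itself.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-decode-base64-string (b64)
  "Base64-decode B64 and decode the resulting bytes as text.
Use `coding-system-for-read' if it is non-nil, otherwise UTF-8."
  (decode-coding-string (base64-decode-string b64)
                        (or coding-system-for-read 'utf-8)))

;; Without an override the bytes are taken as UTF-8.
(my-decode-base64-string "w6Q=")          ; "w6Q=" is the UTF-8 "ä"

;; Simulate the binding that 'C-x RET c latin-1 RET' would set up.
(let ((coding-system-for-read 'latin-1))
  (my-decode-base64-string "5A=="))       ; "5A==" is the Latin-1 "ä"
```

Both calls should return "ä"; only the assumed byte encoding differs.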

> It will work if you encode "ä" first:
>
>   (decode-coding-string (base64-decode-string
>                          (base64-encode-string
>                         (encode-coding-string "ä" 'utf-8)))
>                       'utf-8)

Thanks, this works for strings.
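
For reference, a small sketch of the intermediate values in that
round trip.  The function name my-base64-roundtrip-steps is
hypothetical and exists only to collect the steps in one place.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-base64-roundtrip-steps (text)
  "Return the intermediate values of a UTF-8/base64 round trip of TEXT.
The result is a list (BASE64 RAW-MULTIBYTE-P RAW-LENGTH DECODED)."
  (let* ((bytes (encode-coding-string text 'utf-8)) ; characters -> bytes
         (b64   (base64-encode-string bytes))       ; bytes -> base64 text
         (raw   (base64-decode-string b64)))        ; base64 text -> bytes
    (list b64
          (multibyte-string-p raw)           ; nil: RAW is a unibyte string
          (length raw)                       ; number of bytes
          (decode-coding-string raw 'utf-8)))) ; bytes -> characters

(my-base64-roundtrip-steps "ä")
;; => ("w6Q=" nil 2 "ä")
```

The point is that base64-decode-string yields raw bytes, so a separate
decoding step is needed to get characters back.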

What I really need is a way to decode base64 regions
whose original text was encoded in UTF-8.

First I looked for some post-processing that would recover
the "broken" characters inserted by base64-decode-region.
These characters seem to be the individual bytes of the UTF-8
sequences, inserted into the UTF-8 buffer as characters of the
eight-bit charset.  I failed to find a function that would
convert the result of base64-decode-region into the intended
characters in that buffer.

So I wrote a replacement for base64-decode-region:

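(defun base64-decode-utf8-region (beg end)
  "Base64-decode the region from BEG to END, then decode the bytes as text.
Use `coding-system-for-write' if it is set (for example by `C-x RET c'),
otherwise UTF-8."
  (interactive "r")
  (replace-region-contents beg end
   (lambda ()
     ;; `base64-decode-string' returns raw bytes; turn them into characters.
     (decode-coding-string
      (base64-decode-string
       (buffer-substring beg end))
      (or coding-system-for-write 'utf-8)))))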

But the question remains: is it possible to do the same
in a simpler way without the need to write a new command?




