bug-gnu-emacs

bug#50391: 28.0.50; json-read non-ascii data results in malformed string


From: Lars Ingebrigtsen
Subject: bug#50391: 28.0.50; json-read non-ascii data results in malformed string
Date: Sun, 05 Sep 2021 10:08:35 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Zhiwei Chen <condy0919@gmail.com> writes:

> When fetch json from youdao (a dict service in China).
>
> #+begin_src elisp
> (url-retrieve
>   "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
>   (lambda (_status)
>     (goto-char (1+ url-http-end-of-headers))
>     (write-region (point) (point-max) "/tmp/acc1.json")))
> #+end_src
>
> Then C-x C-f "/tmp/acc1.json", the file is correctly encoded without 
>
> But If `json-read' then `json-insert', the file is malformed even if
> uchardet shows the encoding of the file is utf-8.

When you do the `write-region', Emacs writes the octets you received
from the web server to a file.  When Emacs loads that file in again, it
guesses that it's utf-8 and decodes it that way, so that's why that
works correctly.
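
To make the octets-versus-characters distinction concrete, here is a small sketch that needs no network access.  The function name `my/octets-vs-text-demo' and the two-character sample string are made up for illustration and do not come from the original report.

```elisp
;; Sketch: the same text once as raw UTF-8 octets, once as decoded characters.
;; `my/octets-vs-text-demo' is a hypothetical helper, not part of Emacs.
(defun my/octets-vs-text-demo ()
  "Return facts about a sample string before and after decoding."
  (let* ((sample "\u7d2f\u79ef")                       ; two CJK characters
         (octets (encode-coding-string sample 'utf-8)) ; unibyte: raw bytes
         (text   (decode-coding-string octets 'utf-8))) ; multibyte: characters
    (list (multibyte-string-p octets) ; nil, still raw bytes
          (length octets)             ; 6, three bytes per character
          (multibyte-string-p text)   ; t, decoded
          (length text))))            ; 2, two characters
```

Writing the octets straight to a file and letting Emacs decode them on visiting corresponds to the first, working case in the report.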

> #+begin_src elisp
> (url-retrieve
>   "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
>   (lambda (_status)
>     (goto-char (1+ url-http-end-of-headers))
>     (let ((j (json-read)))
>     (with-temp-buffer
>       (json-insert j)
>       (write-region (point-min) (point-max) "/tmp/acc2.json")))))
> #+end_src

But here you're asking Emacs to use json-read on a buffer that's not
been decoded.  The http buffer at this point looks like this:

[PNG image attachment, not reproduced here]

You have to say (decode-coding-region (point) (point-max) 'utf-8) first
for that to work.  I.e.,

  (url-retrieve
   "https://dict.youdao.com/suggest?q=accumulate&le=eng&num=80&doctype=json"
   (lambda (_status)
     (goto-char (1+ url-http-end-of-headers))
     (let ((buf (current-buffer))
           ;; First position after the HTTP headers, i.e. start of the body.
           (body-start (1+ url-http-end-of-headers)))
       (with-temp-buffer
         (insert-buffer-substring buf body-start)
         ;; Turn the raw octets into characters before parsing.
         (decode-coding-region (point-min) (point-max) 'utf-8)
         (goto-char (point-min))
         (let ((j (json-read)))
           (erase-buffer)
           (json-insert j)
           (write-region (point-min) (point-max) "/tmp/acc2.json"))))))
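
The decode-then-parse order can also be tried without a web request.  The following is only a sketch under stated assumptions: `my/parse-json-octets' is a hypothetical helper, the JSON payload is invented for the example, and it works on a string with `decode-coding-string' and `json-read-from-string' rather than on the http buffer.

```elisp
(require 'json)

;; Sketch: decode raw UTF-8 octets first, then hand the text to json.el.
;; `my/parse-json-octets' is a hypothetical name used only for illustration.
(defun my/parse-json-octets (octets)
  "Decode OCTETS, a unibyte UTF-8 string, and parse the result as JSON."
  (json-read-from-string (decode-coding-string octets 'utf-8)))

;; Simulate what arrives from the server: a unibyte string of UTF-8 bytes.
(defvar my/sample-octets
  (encode-coding-string "{\"q\": \"\u7d2f\u79ef\"}" 'utf-8)
  "Invented sample payload standing in for an undecoded HTTP body.")
```

With json.el's default settings, (my/parse-json-octets my/sample-octets) gives an alist whose `q' entry holds the two decoded characters, which can then be serialized again without producing a malformed string.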


-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no