bug-mit-scheme

[bug #58893] read-xml-file fails with files that are not Unicode NFC-encoded


From: Arthur A. Gleckler
Subject: [bug #58893] read-xml-file fails with files that are not Unicode NFC-encoded
Date: Mon, 3 Aug 2020 15:03:39 -0400 (EDT)
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36

URL:
  <https://savannah.gnu.org/bugs/?58893>

                 Summary: read-xml-file fails with files that are not Unicode
NFC-encoded
                 Project: MIT/GNU Scheme
            Submitted by: aag
            Submitted on: Mon 03 Aug 2020 07:03:37 PM UTC
                Category: None
                Severity: 3 - Normal
                Priority: 5 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                Keywords: 

    _______________________________________________________

Details:

Given a file that is not NFC-encoded, this expression fails:

(read-xml-file "/tmp/reproduce.xml")
;The object "\nx xxxxxxx xxxxxx xx xxxx xxxxxx xx xxxx xxxxxx xx xx xxxxx
xxxx, <x>xxx xxx xx xxxxx,</x> xxx xx xxxxxxxx xx xxx xxxxxxx xxx xxx xxxxxxx.
xxx “xxx xx xxxxx” xxx xxxx xxxx xxx xxxxxx xxxxx xxxx xxxx xxxxx. x xxx
xxxxx xx xx xxxxx xxxx xxxxxxxxxxx xxxx xxx xxxxx xx xxxxxxxxxx xxxxxxx xxx
xxxxxxxx xx xxx’x xxx xxxxxxxxxxxxx. xxx xxxx xxxx xx xxxxxxxxxxx xxxx xxx
xxxxx xx xxxxxxxxxx xxxx xxxxx xx xxxxxxxxxxxxx xxxx, xx xxxxxxxx xxxxxxx,
xxxxxxxxxxx xxx xx xxx xxxxx xxxxx xxxxxxxx xx xxx xxxx. (xxx ...", passed as
an argument to string-find-first-index, is not the correct type.
;To continue, call RESTART with an option number:
; (RESTART 1) => Return to read-eval-print level 1.

2 error> 

I've attached the offending file, which was part of a podcast feed.  In order
to avoid copyright issues, I've replaced [a-zA-Z] with "x" throughout the
textual content.

read-xml-file reads the same file successfully after it has been normalized
with Emacs's ucs-normalize-NFC-region.
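
For anyone who needs the same workaround outside Emacs, here is a minimal
Python sketch that checks whether a file's text is NFC and writes a normalized
copy. It assumes the file is UTF-8; the helper names are made up for
illustration. It only sidesteps the problem and is not a fix for read-xml-file
itself.

```python
import unicodedata
from pathlib import Path


def is_nfc(text: str) -> bool:
    """Return True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def normalize_file_to_nfc(src: str, dst: str, encoding: str = "utf-8") -> bool:
    """Write an NFC-normalized copy of src to dst.

    Assumption: the file is encoded as `encoding` (UTF-8 by default).
    Bytes are decoded and encoded explicitly so line endings are left alone.
    Returns True if the content changed, i.e. src was not in NFC.
    """
    text = Path(src).read_bytes().decode(encoding)
    normalized = unicodedata.normalize("NFC", text)
    Path(dst).write_bytes(normalized.encode(encoding))
    return normalized != text
```

Calling normalize_file_to_nfc("/tmp/reproduce.xml", "/tmp/reproduce-nfc.xml")
(the second path is only an example) should produce a file equivalent to what
ucs-normalize-NFC-region yields when applied to the whole buffer, which can
then be passed to read-xml-file.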



    _______________________________________________________

File Attachments:


-------------------------------------------------------
Date: Mon 03 Aug 2020 07:03:37 PM UTC  Name: reproduce.xml  Size: 24KiB   By:
aag

<http://savannah.gnu.org/bugs/download.php?file_id=49620>

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58893>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



