From: | Markus Mützel |
Subject: | [Octave-bug-tracker] [bug #59203] [octave forge] (io) Problem with xlsread importing accent marks |
Date: | Thu, 1 Oct 2020 16:17:57 -0400 (EDT) |
User-agent: | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36 Edg/85.0.564.63 |
Follow-up Comment #11, bug #59203 (project octave):

Please, remove "unicode2utf8" and "utf82unicode". I wrote those functions when I didn't understand how encoding works. They break the encoding when invoked for the OCT interface and they are de-activated for the COM interface.

The attached patch does that change for "xls2oct.m" and implements the correct conversion for the COM interface (by default now).

The COM interface can't read characters above Unicode code point 255 (see the attached example that correctly reads with OCT but not with COM). That might be an issue with the windows package. The characters are already lost in line 122 in "__COM_spsh2oct__.m". I added a FIXME note in "xls2oct.m" that the conversion might need revisiting when/if something should change there.

I've only tested the OCT and COM interfaces.

(file #49902, file #49903)

_______________________________________________________

Additional Item Attachment:

File name: bug59203_xlsread_unicode.patch    Size: 4 KB
    <https://file.savannah.gnu.org/file/bug59203_xlsread_unicode.patch?file_id=49902>

File name: Unicode.xlsx    Size: 9 KB
    <https://file.savannah.gnu.org/file/Unicode.xlsx?file_id=49903>

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59203>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
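
To illustrate the symptom described above (characters up to code point 255 survive, anything higher is lost), here is a minimal Python sketch. It is not the code from the patch and does not use the io or windows packages. It assumes the loss behaves like a narrowing to a one-byte-per-character encoding (Latin-1), which the message does not confirm. The helper name `to_single_byte` and the sample strings are hypothetical.

```python
# Illustration only: simulates a channel that can carry one byte per character.
# ASSUMPTION: the COM-side loss resembles a Latin-1 narrowing; the original
# message only states that code points above 255 are not read correctly.

def to_single_byte(text: str) -> str:
    """Round-trip text through Latin-1, replacing unrepresentable characters."""
    return text.encode("latin-1", errors="replace").decode("latin-1")


def max_code_point(text: str) -> int:
    """Return the highest Unicode code point that occurs in text."""
    return max(ord(ch) for ch in text)


if __name__ == "__main__":
    # Hypothetical cell contents: accented Latin text, Greek, and CJK.
    samples = ["café", "Mützel", "Ωμέγα", "日本語"]
    for s in samples:
        print(f"{s!r}: max code point {max_code_point(s)}, "
              f"after narrowing {to_single_byte(s)!r}")
```

Under this assumption, accented Latin-1 characters such as "é" and "ü" pass through unchanged, while Greek or CJK text is replaced wholesale. That pattern resembles the behaviour reported for the COM interface with the attached Unicode.xlsx.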