From: Markus Mützel
Subject: [Octave-bug-tracker] [bug #59203] [octave forge] (io) Problem with xlsread importing accent marks
Date: Fri, 2 Oct 2020 02:41:55 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36 Edg/85.0.564.63

Follow-up Comment #16, bug #59203 (project octave):

I used the attached script to try to write non-ASCII characters to .xlsx
files. It contains some ASCII, Latin-1, and two more characters inside and
outside the BMP. The script is encoded in UTF-8, so you might have to change
your character encoding settings in Octave if you want to run tests with it.
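
To illustrate the character classes mentioned above, here is a minimal sketch in Python (chosen only because it is easy to run; it is not part of the patch or the attached script). The sample characters are assumptions picked for illustration and may differ from the ones in tst_unicode_xlsx.m. The helper `describe` is hypothetical.

```python
# Illustrative samples only; the attached test script may use other characters.
samples = {
    "ASCII": "a",                  # U+0061
    "Latin-1": "\u00e4",           # U+00E4, a with diaeresis
    "BMP": "\u20ac",               # U+20AC, euro sign
    "outside BMP": "\U0001F600",   # U+1F600, needs a UTF-16 surrogate pair
}


def describe(ch):
    """Return (code point, UTF-8 byte count, UTF-16 code unit count)."""
    return ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-16-le")) // 2


for label, ch in samples.items():
    code_point, n_utf8, n_utf16 = describe(ch)
    print(f"{label:12s} U+{code_point:04X}  "
          f"UTF-8 bytes: {n_utf8}  UTF-16 units: {n_utf16}")
```

Only the ASCII character fits into a single UTF-8 byte; everything else occupies several 8-bit elements when stored as UTF-8.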

With the attached patch, it is possible to write and read all of these
characters with the OCT interface. (Excel 2013 seems to have problems
displaying the character outside the BMP though. Not an Octave bug.)
The COM interface can only write and read characters from the Latin-1 subset
of Unicode (code points 0-255). It looks like the windows package is limited by
Octave's `char` type (only 8-bit) in this respect, so there is not much we can
do about this here. I thought the best thing would be to replace unencodable
characters with a "?" (question mark).
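
The replacement behaviour described above can be sketched as follows. This is a Python illustration of the general idea only, not the code in the patch; the function name `to_latin1_with_placeholder` is hypothetical.

```python
def to_latin1_with_placeholder(text):
    """Keep code points 0-255 and replace everything else with '?'.

    Uses Python's built-in "replace" error handler, which substitutes
    '?' for each character the target codec cannot encode.
    """
    return text.encode("latin-1", errors="replace").decode("latin-1")


# ASCII and Latin-1 survive; the euro sign and the non-BMP character do not.
print(to_latin1_with_placeholder("a\u00e4\u20ac\U0001F600"))
```

The substitution is lossy: after a round trip the original character cannot be recovered from the "?".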

I also changed the docstrings of `utf82unicode` and `unicode2utf8` to make it
clearer what they do and which limitations they have. With that change, it's
probably OK to keep them around for a bit longer if that's your preference.

This patch replaces the previous patches here.

I did all my tests on Windows 10. IIUC, the command window in Octave is broken
for UTF-8 on Windows 7. You shouldn't try to work around the broken command
window (e.g. by using the "wrong" encoding for character arrays). That would
break lots of other things. The variable editor should still work for checking
the results, though.

(file #49908, file #49909)
    _______________________________________________________

Additional Item Attachment:

File name: bug59203_xls_unicode.patch     Size:9 KB
    <https://file.savannah.gnu.org/file/bug59203_xls_unicode.patch?file_id=49908>

File name: tst_unicode_xlsx.m             Size:0 KB
    <https://file.savannah.gnu.org/file/tst_unicode_xlsx.m?file_id=49909>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59203>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



