[Octave-bug-tracker] [bug #55452] fopen() does not support encoding argu
From: Markus Mützel
Subject: [Octave-bug-tracker] [bug #55452] fopen() does not support encoding argument
Date: Sat, 9 Mar 2019 11:10:59 -0500 (EST)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0
Follow-up Comment #10, bug #55452 (project octave):
Thanks for your testing.
I only checked with fprintf (fid, "%s", string), and fscanf (fid, "%s")
before. I didn't have a look at "fgetl" yet. It looks like these functions
take different code paths.
If I replace the function "slurp_file_one_line" in your test suite with the
following, the results look a little bit better:
function out = slurp_file_one_line (file, encoding)
  try
    [fh, msg] = fopen (file, "r", "native", encoding);
    if fh < 0
      error ("Failed opening file for reading: %s: %s", msg, file);
    endif
    # out = fgetl (fh);
    out = fscanf (fh, "%s");
    fclose (fh);
    out = out(:)';
  catch err
    err
    out = "";
  end_try_catch
endfunction
On Windows, there seem to be at least two more different bugs:
>> run_bug_55452_tests
Running fixed-text encoded file test ex-001:
Reference text: Hello,world! (12 chars)
running: ex-001 ISO-8859-1
decoded: Hello,world! (12 chars)
ok: ex-001 ISO-8859-1
running: ex-001 ISO-8859-15
decoded: Hello,world! (12 chars)
ok: ex-001 ISO-8859-15
running: ex-001 KOI8-R
decoded: Hello,world! (12 chars)
ok: ex-001 KOI8-R
running: ex-001 SHIFT_JIS
decoded: Hello,world! (12 chars)
ok: ex-001 SHIFT_JIS
running: ex-001 UTF-16
decoded: Hello,world! (12 chars)
ok: ex-001 UTF-16
running: ex-001 UTF-16 no-bom
err =
scalar structure containing the fields:
message = fopen: conversion from codepage 'utf-16' not supported
identifier =
stack =
3x1 struct array containing the fields:
file
name
line
column
scope
decoded: (0 chars)
FAIL: ex-001 UTF-16 no-bom
Running fixed-text encoded file test ex-002:
Reference text: あありりががととうう丸丸 (18 chars)
running: ex-002 SHIFT_JIS
decoded: あありりががととうう丸丸 (18 chars)
ok: ex-002 SHIFT_JIS
running: ex-002 UTF-16
decoded: あありりががととうう丸丸 (18 chars)
ok: ex-002 UTF-16
Running fixed-text encoded file test ex-003:
Reference text: KaßnerÖkonomSchöpsÜbermutMüller (36 chars)
running: ex-003 ISO-8859-1
decoded: KaßnerÖkonomSchöpsÜbermutMüller (36 chars)
ok: ex-003 ISO-8859-1
running: ex-003 UTF-16
decoded: KaßnerÖkonomSchöpsÜbermutMüller (36 chars)
ok: ex-003 UTF-16
There doesn't seem to be a convenient function to get the number of characters
in a string straight away (or I forgot about it). "numel" returns the number
of bytes in the char array. Maybe "max (unicode_idx (str))" would be more
correct.
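To illustrate the difference between bytes and characters, here is a minimal sketch. It assumes the char array holds UTF-8 bytes and counts every byte that is not a UTF-8 continuation byte. The variable names are made up for this example, and "max (unicode_idx (str))" should give the same count.

```octave
# "Kaßner" built from its UTF-8 bytes, so the example does not depend on
# the encoding of the source file.  "ß" occupies two bytes (195 159).
str = char ([75 97 195 159 110 101 114]);

# numel counts the elements of the char array, i.e. bytes.
n_bytes = numel (str);

# Continuation bytes have the bit pattern 10xxxxxx (value & 192 == 128);
# every other byte starts a new character.
n_chars = sum (bitand (uint8 (str), 192) != 128);

printf ("%d bytes, %d characters\n", n_bytes, n_chars);
```

For this string the two counts differ by one, because only "ß" needs more than one byte.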
Back on topic:
In the f* family of functions, I think that "fwrite" and "fread" should ignore
the encoding and just handle "pure bytes".
"fputs" and "fprintf" (%s format arguments and the format string itself)
should probably convert to the specified encoding.
"fgetl" and "fscanf" (%s format arguments) should be converted from the
specified encoding.
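A rough sketch of the proposed split, emulated with functions that already exist: "fwrite" and "fread" move raw bytes, while the conversion that "fprintf" and "fscanf" would do internally is spelled out with "unicode2native" and "native2unicode". The file name and variable names are hypothetical, and this only approximates the suggested behaviour; it is not the patch itself.

```octave
# "Kaßner" as UTF-8 bytes (Octave's internal string encoding).
utf8_str = char ([75 97 195 159 110 101 114]);
encoding = "ISO-8859-1";

fname = tempname ();

# Writing: convert to the file encoding, then write pure bytes.
fid = fopen (fname, "w");
fwrite (fid, unicode2native (utf8_str, encoding), "uint8");
fclose (fid);

# Reading: fread returns the bytes untouched ...
fid = fopen (fname, "r");
raw = fread (fid, Inf, "uint8=>uint8")';
fclose (fid);
unlink (fname);

# ... and the text functions would convert them back to UTF-8.
decoded = native2unicode (raw, encoding);

printf ("%d bytes on disk, round trip ok: %d\n", ...
        numel (raw), strcmp (decoded, utf8_str));
```

In ISO-8859-1 the "ß" is a single byte, so the file is one byte shorter than the UTF-8 string held in memory.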
I am not sure how to handle "fgets": Should we just read one byte and return
that? Or should we make sure that we read one character (whatever the number
of bytes necessary)?
Please let me know if I'm missing something.
I have never worked with multi-byte encodings like SHIFT-JIS. How do they encode
ASCII characters? I am wondering whether fprintf correctly treats the format
string on the current default branch.
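A small experiment along these lines, as a sketch: it assumes the underlying iconv accepts the name "SHIFT_JIS" in "unicode2native". In Shift_JIS, ASCII letters are single bytes with their usual values, but the second byte of a two-byte character may fall in the range 0x40-0x7E. That range includes "\" (0x5C) but not "%" (0x25), so a format string that has already been converted to Shift_JIS could contain a spurious backslash. The variable names are made up for this example.

```octave
# Plain ASCII text keeps its byte values.
ascii_bytes = unicode2native ("Hello", "SHIFT_JIS");

# Katakana "ソ" (U+30BD), given as UTF-8 bytes to stay independent of
# the encoding of the source file.
katakana_so = char ([227 130 189]);
sjis = unicode2native (katakana_so, "SHIFT_JIS");

# Check whether any encoded byte collides with "\" (92) or "%" (37).
has_backslash = any (sjis == 92);
has_percent = any (sjis == 37);

printf ("bytes: %s, backslash: %d, percent: %d\n", ...
        mat2str (double (sjis)), has_backslash, has_percent);
```

If the format string were parsed after conversion, the trailing 0x5C of such a character would look like the start of an escape sequence. Parsing the format string before converting would sidestep this.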
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?55452>