[Octave-bug-tracker] [bug #50619] textscan weird behaviour when reading
From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #50619] textscan weird behaviour when reading a csv
Date: Sat, 25 Mar 2017 06:06:22 -0400 (EDT)
Follow-up Comment #8, bug #50619 (project octave):
I've tracked this down a bit, so I'm just writing some notes here for
reference:
I printed out the "is.tellg()" for:
void
textscan::scan_string (delimited_stream& is, const textscan_format_elt& fmt,
                       std::string& val) const
{
  if (delim_list.is_empty ())
    {
      unsigned int i = 0;
      unsigned int width = fmt.width;
      fprintf(stderr, "width=%d\n", width);
      for (i = 0; i < width; i++)
        {
          fprintf(stderr,"+%d",i);
          if (i+1 > val.length ())
            val = val + val + ' ';      // grow even if empty
          int ch = is.get ();
          if (is_delim (ch) || ch == std::istream::traits_type::eof ())
            {
              fprintf(stderr, "address = %u\n", is.tellg());
              is.putback (ch);
              break;
            }
          else
            val[i] = ch;
        }
      val = val.substr (0, i);          // trim pre-allocation
    }
  else  // Cell array of multi-character delimiters
Here's the result for the test case:
+0+1+2+3+4+5+6+7+8address = 7867337
+0+1+2+3+4+5+6+7+8+9address = 7867347
+0+1+2+3+4+5+6+7+8+9address = 7867357
+0+1+2+3+4+5address = 7867363
+0+1+2+3+4+5address = 7867369
+0+1+2+3+4+5+6+7+8+9+10+11address = 7867381
+0+1+2+3+4+5+6+7+8+9+10+11+12+13address = 7867343
What this tells me is that the pointer advances as expected with each
is.get(). That is, the number of "+" entries on a line is the number of
characters added to the pointer's previous value to get (hopefully) the next
pointer address. That holds until the last field, "heading [deg]", which
takes fourteen reads. There the pointer makes an odd jump backward (!),
whereas we'd expect 7867381 + 14 = 7867395.
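The bookkeeping above can be checked mechanically. This is only a sketch for reading the trace, not part of textscan; the names trace_entry and first_inconsistent are made up for illustration. It assumes each trace line is reduced to the number of "+" entries and the printed address.

```cpp
#include <cstddef>
#include <vector>

// One line of the debug trace: how many is.get() calls were made for the
// field, and the address printed when the field ended.
// (Hypothetical helper type, not an Octave type.)
struct trace_entry
{
  unsigned long reads;
  unsigned long address;
};

// Return the index of the first entry whose address is not the previous
// address plus its own read count, or -1 if every entry is consistent.
int
first_inconsistent (const std::vector<trace_entry>& trace)
{
  for (std::size_t k = 1; k < trace.size (); k++)
    if (trace[k].address != trace[k-1].address + trace[k].reads)
      return static_cast<int> (k);

  return -1;
}
```

Fed the seven lines of the trace above, this flags the last entry (index 6), since 7867381 + 14 is not 7867343.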
This stream:
delimited_stream is (isp,
                     (delim_table.empty () ? whitespace + "\r\n" : delims),
                     max_lookahead, buf_size);
isn't behaving nicely. The max_lookahead is 3, and the buf_size is 80. (I
recall somewhere else there being a buffer size of 4096...but don't take that
as being of some significance, as I don't quite understand the implication of
buf_size.)
I can see what is wrong. See the delims passed into this delimited stream?
Later, when the ch = is.get() character is tested with is_delim(ch), it's
those delims (a C++ std::string) that are looked for. The only delimiter
going into that "is" instantiation is ";". So this delimited_stream doesn't
recognize the new-line character as a delimiter. It's just another character,
so the delimited stream keeps reading until hitting another ";" character.
There must be some odd relationship between line length and buf_size that
causes the pointer to advance to some strange place in the next line for the
next textscan(). Note: even though col_headers looks to be reading the
"heading [deg]" properly, I think it's not, and somehow the new-line
character-plus (i.e., "\n5.2500000000000") is dropped somewhere along the way
when converted to cell-string.
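To illustrate the effect in isolation, here is a minimal sketch using a plain std::istringstream rather than Octave's delimited_stream, so it says nothing about buf_size or lookahead. The functions scan_field and second_field and the sample text are hypothetical; only the read-until-delimiter-then-putback pattern is taken from the loop quoted above.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Read characters until one of DELIMS or end of input is seen.  The
// delimiter is put back into the stream, as in the loop quoted above.
std::string
scan_field (std::istream& is, const std::string& delims)
{
  std::string val;
  int ch;

  while ((ch = is.get ()) != std::istream::traits_type::eof ())
    {
      if (delims.find (static_cast<char> (ch)) != std::string::npos)
        {
          is.putback (static_cast<char> (ch));
          break;
        }

      val += static_cast<char> (ch);
    }

  return val;
}

// Return the second field of TEXT, given the delimiter set DELIMS.
std::string
second_field (const std::string& text, const std::string& delims)
{
  std::istringstream is (text);

  scan_field (is, delims);   // first field
  is.get ();                 // consume the delimiter that was put back

  return scan_field (is, delims);
}
```

With the made-up input "time;heading [deg]\n5.25;44\n", a delimiter set of ";" returns "heading [deg]\n5.25" for the second field, because the new-line is just another character. A delimiter set of ";\n" returns "heading [deg]".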
So, as a little test, let's try putting ";\n" in for the delimiters in the
test code, i.e., textscan(file, formatSpec, 1, 'Delimiter', ";\n"):
+0+1+2+3+4+5+6+7+8address = 7866201
+0+1+2+3+4+5+6+7+8+9address = 7866211
+0+1+2+3+4+5+6+7+8+9address = 7866221
+0+1+2+3+4+5address = 7866227
+0+1+2+3+4+5address = 7866233
+0+1+2+3+4+5+6+7+8+9+10+11address = 7866245
+0+1+2+3+4+5+6+7+8+9+10+11+12+13address = 7866259
OK, now things look proper, i.e., 7866245 + 14 = 7866259. Unfortunately, the
result still isn't quite correct:
octave:16> logLine
logLine =
{
  [1,1] = 0
  [1,2] = 44
  [1,3] = 10
  [1,4] = 0
  [1,5] = 0
  [1,6] = 0
  [1,7] = 44.998
}
Better! But the first entry isn't 5.25. Again, some strange interaction with
the new-line character and placing it back into the stream, maybe?
That's where I am. On the trail, I think, but only close so far.
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?50619>