From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #52479] textscan ignores leading spaces when creating cell from string/file
Date: Wed, 22 Nov 2017 15:12:01 -0500 (EST)

Follow-up Comment #2, bug #52479 (project octave):

It looks like the scanning routines all skip white space in some form before
reading a field:


  textscan::scan_one (delimited_stream& is, const textscan_format_elt& fmt,
                      octave_value& ov, Array<octave_idx_type> row)
  {
    skip_whitespace (is);


The default whitespace string set in the constructor includes the \t
character, so leading tabs get skipped even when tab is the chosen delimiter:


  textscan::textscan (const std::string& who_arg)
    : who (who_arg), buf (), whitespace_table (), delim_table (),
      delims (), comment_style (), comment_len (0), comment_char (-2),
      buffer_size (0), date_locale (), inf_nan (init_inf_nan ()),
      empty_value (numeric_limits<double>::NaN ()), exp_chars ("edED"),
      header_lines (0), treat_as_empty (), treat_as_empty_len (0),
      whitespace (" \b\t"), eol1 ('\r'), eol2 ('\n'),
      return_on_error (1), collect_output (false),
      multiple_delims_as_one (false), default_exp (true), lines (0)
  { }


To confirm this hypothesis, note that using a comma as a delimiter works:


octave:1> a = ",,a,b,c\n"
a = ,,a,b,c

octave:3> textscan(a, '%s', 'delimiter', sprintf(','))
ans =
{
  [1,1] =
  {
    [1,1] = 
    [2,1] = 
    [3,1] = a
    [4,1] = b
    [5,1] = c
  }

}
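
As a rough illustration of why the comma case behaves differently, here is a
toy model of the scan loop.  It is not Octave's implementation: toy_scan is a
hypothetical function that only mimics the "skip whitespace, then read up to
the next delimiter" pattern described above.

```cpp
#include <string>
#include <vector>

// Toy model only (hypothetical, not Octave code): before each field, skip
// every character found in `whitespace`, then read up to the next `delim`.
std::vector<std::string>
toy_scan (const std::string& line, char delim, const std::string& whitespace)
{
  std::vector<std::string> fields;
  std::size_t pos = 0;

  while (pos < line.size ())
    {
      // Mimic skip_whitespace: advance past anything in the whitespace set.
      while (pos < line.size ()
             && whitespace.find (line[pos]) != std::string::npos)
        pos++;

      if (pos >= line.size ())
        break;

      // Read one field, up to the next delimiter or the end of the line.
      std::size_t end = line.find (delim, pos);
      if (end == std::string::npos)
        end = line.size ();

      fields.push_back (line.substr (pos, end - pos));
      pos = end + 1;
    }

  return fields;
}
```

In this model, scanning ",,a,b,c" with ',' as the delimiter and " \b\t" as
whitespace keeps the two leading empty fields, while scanning "\t\ta\tb\tc"
with '\t' as the delimiter and the same whitespace set loses them, because
the tabs are consumed as whitespace before they can act as delimiters.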


What needs to be done is to remove any delimiter characters from the
whitespace string.  That is, if the user selects


'delimiter', sprintf('\t')


as you've done, then that character should be removed from the whitespace
string.

I'm attaching a patch that does just that.  It seems to be in a reasonable
location for modifying the whitespace list.  However, I do see "isspace()"
used in many places throughout the textscan code, so I'm not sure the patch
catches everything.  Perhaps JWE and Rik can review it for a more efficient
use of strings or a more general fix.
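
For readers without the attachment, the idea can be sketched in isolation as
follows.  This is not the patch itself: strip_delims_from_whitespace is a
hypothetical standalone helper, and where such logic would actually belong in
the textscan class is an assumption left open here.

```cpp
#include <string>

// Hypothetical helper, not part of Octave: return a copy of `whitespace`
// with every character that also appears in `delims` removed.
std::string
strip_delims_from_whitespace (const std::string& whitespace,
                              const std::string& delims)
{
  std::string result;

  for (char c : whitespace)
    if (delims.find (c) == std::string::npos)
      result += c;   // keep only characters that are not delimiters

  return result;
}
```

With the default whitespace " \b\t" and a tab delimiter, this would leave
" \b", so a leading tab is no longer skipped as whitespace.  Code paths that
call isspace() directly would not be affected by such a change.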

(file #42460)
    _______________________________________________________

Additional Item Attachment:

File name: octave-textscan_whitespace_delimiter-djs2017nov22.patch Size:1 KB


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?52479>
