octave-bug-tracker

[Octave-bug-tracker] [bug #60138] [octave forge] (dataframe) `dataframe`


From: Pascal Dupuis
Subject: [Octave-bug-tracker] [bug #60138] [octave forge] (dataframe) `dataframe` incorrectly parses csv entries containing commas inside quotes
Date: Sat, 27 Feb 2021 15:56:31 -0500 (EST)

Follow-up Comment #3, bug #60138 (project octave):

Hello Tasos,
I'm glad you appreciate my efforts.
 
About the release: that's another issue. A few years ago, Carne Draug would
have helped with some of the intricacies of releasing a new version. But he
resigned, and there is no longer a benevolent manager (that I know of) for
https://octave.sourceforge.io. Look at the copyright: 2018. Latest news: 26
Aug 2018.

I had a look at the IO package. Basically, the parsing is done in C, which is
much faster. Their code flips an indicator whenever it detects a quote, so
separators that fall inside quotes are kept in the string. It's better than
mine, which is written in Octave, but it can't detect malformed strings. There
is a discussion at
https://stackoverflow.com/questions/18144431/regex-to-split-a-csv
about regexes for parsing a CSV string. The subject is complex.
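
To make the quote-flag idea concrete, here is a minimal sketch in Python. It is
not the IO package's code: the function name split_csv_line and its parameters
are hypothetical, and the returned "unbalanced" flag is an addition for
illustration only. Doubled quotes used as escapes are not handled; they are
simply dropped.

```python
def split_csv_line(line, sep=",", quote='"'):
    """Split one CSV line, keeping separators that sit inside quotes.

    Hypothetical helper for illustration; not taken from the IO package.
    Returns (fields, unbalanced), where unbalanced is True if the line
    ended while still inside a quoted section.
    """
    fields = []
    current = []
    in_quotes = False  # the indicator that is flipped on every quote
    for ch in line:
        if ch == quote:
            in_quotes = not in_quotes  # toggle; the quote itself is dropped
        elif ch == sep and not in_quotes:
            fields.append("".join(current))  # separator outside quotes ends a field
            current = []
        else:
            current.append(ch)  # includes separators seen while inside quotes
    fields.append("".join(current))
    return fields, in_quotes


fields, unbalanced = split_csv_line('1,"Smith, John",3')
```

With this input the comma inside the quotes stays part of the second field. A
line with an odd number of quotes still produces fields, which is why a plain
toggle alone cannot tell a well-formed line from a malformed one.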

About R: data frames are a kind of matrix in which each column may have its own
subtype, with meta-information attached. Plus you have the magic of OOP to use
them as first-class citizens: you ask for a matrix operation? You get it. You
ask for a list view? You get it. You want to select by column name? What about
"myfit = lm(Y~X, data=mycsv)"? It's much more than syntactic sugar, it's
about elegance. You focus on the relationship between two variables from a
dataset, without caring about their position. Columns can be factors too.
Sophisticated regressions rely on dummy variables, and R operations like
'lm' handle that the right way.
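
As a rough illustration of what dummy variables mean here, the sketch below
treats a data frame as a plain dictionary of named columns and expands a
categorical column into 0/1 indicator columns, using the first level in sorted
order as the reference. The names dummy_code and frame and the sample values
are made up for this example; this is neither R's implementation nor the
dataframe package's.

```python
def dummy_code(values):
    """Expand a categorical column into 0/1 indicator columns.

    Hypothetical helper: the first level (in sorted order) is the
    reference and gets no column of its own.
    """
    levels = sorted(set(values))
    reference, others = levels[0], levels[1:]
    columns = {
        level: [1 if v == level else 0 for v in values]
        for level in others
    }
    return reference, columns


# A toy "data frame": columns are selected by name, not by position.
frame = {
    "Y": [1.0, 2.5, 3.1, 4.2],
    "X": [0.5, 1.0, 1.5, 2.0],
    "group": ["a", "b", "a", "c"],
}

reference, dummies = dummy_code(frame["group"])
```

Here "a" becomes the reference level and "b" and "c" each get an indicator
column, which is the kind of bookkeeping a regression routine has to do before
fitting on a categorical variable.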

Those were the objectives I wanted to reach when I started to write the data
frame package. But 10 years later, I still do not have the basic
implementation running as it should.

Regards

Pascal
 

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?60138>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



