

From: Philip Nienhuis
Subject: [Octave-bug-tracker] [bug #59277] xls2oct and/or openxls behave unexpected
Date: Fri, 16 Oct 2020 16:40:32 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:52.0) Gecko/20100101 Firefox/52.0

Update of bug #59277 (project octave):

                  Status:               Need Info => Confirmed              
             Assigned to:                    None => philipnienhuis         
                 Release:                   5.2.0 => other                  
        Operating System:       Microsoft Windows => Any                    

    _______________________________________________________

Follow-up Comment #7:

@Dennis,
your bug report title "behave unexpected" allows some thread drift, and that is
happening a bit here.
So far I see two issues:

1. Interface trouble
You note the UNO interface being activated.
I think I found out how that can happen, and I have a fix as a corollary of
bug #59273 (needs verification over there). Bug #59273 is partly another
manifestation of the same bug affecting you here.
Bug status "confirmed" applies to this issue.

2. Speed (lack of)
IIRC the sharedStrings XML can contain nested or at least very complicated
strings; IIRC there was even a bug report about it 2 or 3 years ago. So
simplifying the code that handles it needs proper diligence.
Note that there are also "shared formulas"; we haven't implemented support for
them yet, but that will also depend on how we read & update sharedStrings. (Most
users just use cached cell values, but other interfaces allow uncovering the
underlying formulas rather than their results, and that doesn't fully work yet
in OCT.)
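
To illustrate one kind of complexity in sharedStrings: an <si> entry may hold
either a single <t> element or several rich-text runs (<r>), each with its own
<t>. Below is a minimal sketch of collecting the text per entry. It is not the
code the io package uses; the XML snippet and variable names are made up for
illustration, and it does not cover every construct the file format allows.

```octave
% Hypothetical, stripped-down sharedStrings content (namespaces omitted):
% entry 1 is a plain string, entry 2 consists of two rich-text runs.
xml = ['<sst><si><t>plain</t></si>' ...
       '<si><r><rPr><b/></rPr><t>rich </t></r>' ...
       '<r><t xml:space="preserve">text</t></r></si></sst>'];

% One token per <si> ... </si> entry
items = regexp (xml, '<si>(.*?)</si>', 'tokens');

strs = cell (size (items));
for k = 1:numel (items)
  % All <t> elements inside this entry, with or without attributes
  parts = regexp (items{k}{1}, '<t(?:\s[^>]*)?>(.*?)</t>', 'tokens');
  tmp = cellfun (@(p) p{1}, parts, 'UniformOutput', false);
  strs{k} = [tmp{:}];      % concatenate the runs in document order
end

disp (strs);
```

Any simplification of the real parsing code would have to give the same
result for such multi-run entries as for plain ones.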

I'll soon come up with a patch for the interface issues; hopefully you are also
willing to test it for me (like the person who entered bug #59273). That issue
by itself warrants a bug fix release 2.6.3 of the io package.
I'm sorry you didn't report it earlier.

As to speed, I doubt I have enough time to dive into it now, but that
*will* happen. The nice thing about Octave is that it is open source, so you
don't need to wait until it's fixed in a next release. As I understand it
you've already fixed it for yourself; hopefully, if you test that a bit more, I
can accept it with more confidence.
It recently emerged that function calls in cellfun work faster as follows:
cellfun ("<function name>", <cell array>, ...)
than
cellfun (@<function name>, <cell array>, ...)
so perhaps you can shave off some more execution time.
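
A quick way to check the name-versus-handle difference on your own machine is
sketched below. It uses cellfun, the function that takes a function name or
handle as its first argument; the test data and variable names are made up,
and timings will vary with Octave version and with the function being called.

```octave
% Made-up test data: 1e5 cells, every other one empty
c = num2cell (rand (1, 1e5));
c(1:2:end) = {[]};

% Function given by name (string)
tic; r1 = cellfun ("isempty", c); t_name = toc;

% Same function given as a handle
tic; r2 = cellfun (@isempty, c); t_handle = toc;

% Both forms must return the same result
assert (isequal (r1, r2));
printf ("name: %.4f s, handle: %.4f s\n", t_name, t_handle);
```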

BTW if you happen to have Excel installed on your machine, the COM interface
may work 5 or more times faster than OCT - but for large spreadsheets it
may fill up RAM quickly.
The regular expressions that OCT uses are known to be slow for very long
strings.

For really large datasets I'd advise reconsidering whether you really need them
in spreadsheets. Dedicated databases usually outperform any spreadsheet and,
because of more rigid business rules, will save you from many sneaky errors
commonly seen in spreadsheets, like incomplete cell references, badly copied
cells, dangling cell cross-references, etc. Not to mention the stuff that
Excel and even LibreOffice do, or don't, behind your back (e.g., recalculation
and cell content updates, implicit formatting affecting content formats).


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59277>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
