octave-bug-tracker

[Octave-bug-tracker] [bug #59277] xls2oct and/or openxls behave unexpected


From: Dennis
Subject: [Octave-bug-tracker] [bug #59277] xls2oct and/or openxls behave unexpected
Date: Mon, 19 Oct 2020 15:44:15 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36

Follow-up Comment #14, bug #59277 (project octave):

Philip, 

I have read the bug report and comments of bug #51512 and looked for usable
Excel files in that thread. Unfortunately I couldn't find any. Doesn't matter,
I can make my own :-).

First test: an Excel sheet with 1000x10 unique strings (see attachment 1 for
the Excel file (Excel2019 stringTest1.xlsx) and attachment 2 for
__OCT_xlsx2oct__.m). The original code (executing line 146) yields:

Elapsed time is 20.6734 seconds.
   #         Function Attr     Time (s)   Time (%)        Calls
---------------------------------------------------------------
  32         cell2mat             9.210      45.08        20006
  23           regexp             6.132      30.02        10006
  13 __OCT_xlsx2oct__             2.054      10.06            1
  24          cellfun             0.864       4.23        90034
  40       str2double             0.505       2.47            3


Second try, commenting out line 146 and uncommenting line 147:

Elapsed time is 8.37976 seconds.
   #         Function Attr     Time (s)   Time (%)        Calls
---------------------------------------------------------------
  23           regexp             5.304      63.57        10006
  13 __OCT_xlsx2oct__             1.647      19.73            1
  32       str2double             0.439       5.26            3
  48           strrep             0.231       2.77            5
  26              cat             0.218       2.61            7


This is much faster, as previously concluded.
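
For anyone who wants to reproduce this kind of measurement, timings and
profile tables like the ones above can be collected with Octave's built-in
profiler. The following is only a minimal sketch: work() is a hypothetical
stand-in for the call being measured (here it would be the spreadsheet read),
and the sample strings are made up.

```octave
% Hypothetical stand-in for the call being measured (e.g. a spreadsheet read).
work = @() cellfun (@(s) regexp (s, '\d+', 'match'), ...
                    {"a1", "b22", "c333"}, "UniformOutput", false);

profile clear;     % discard data from earlier runs
profile on;        % start collecting per-function timings
tic;
result = work ();
toc;               % prints "Elapsed time is ... seconds."
profile off;

% Show the five most expensive functions, as in the tables above.
profshow (profile ("info"), 5);
```

Note that the profiler adds overhead of its own, so the elapsed time measured
with profiling on is usually higher than in a plain run.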

Try 3: commenting out lines 145-148 and uncommenting the new line 149. This
applies regexp to the complete cell array of strings at once. This yields:

Elapsed time is 4.11125 seconds.
   #   Function Attr     Time (s)   Time (%)        Calls
---------------------------------------------------------
  93     regexp             3.161      69.93           30
 114 str2double             0.259       5.73            6
 112        cat             0.183       4.04           14
  49     strrep             0.178       3.95          100
 103     system             0.122       2.69            1


This is even faster. Since this implementation uses nested cell2mat as in the
original code, it seems to me that this should be acceptable. The only
difference is that it cuts out the for loop.
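
To illustrate the difference between the two approaches (this is not the
actual code from __OCT_xlsx2oct__.m; the sample strings and the pattern below
are made up for the example), here is a sketch of a per-element regexp loop
next to a single regexp call on the whole cell array:

```octave
% Hypothetical sample data: one XML-like fragment per cell.
strs = {"<t>alpha</t>", "<t>beta</t>", "<t>gamma</t>"};
pat  = '<t>(.*?)</t>';

% Variant A: one regexp call per element, inside a for loop.
out_loop = cell (size (strs));
for ii = 1:numel (strs)
  tok = regexp (strs{ii}, pat, "tokens", "once");
  out_loop{ii} = tok{1};
end

% Variant B: a single regexp call on the complete cell array.
tok_all = regexp (strs, pat, "tokens", "once");
out_vec = cellfun (@(t) t{1}, tok_all, "UniformOutput", false);

% Both variants should give the same cell array of strings.
assert (isequal (out_loop, out_vec));
```

Variant B makes one regexp call instead of one per string, which matches the
drop in the regexp call count in the profile above (10006 calls versus 30).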

To put this to the test a bit more, I applied the code to a LibreOffice6
file with identical contents (attachment 3 (LibreOffice6 stringTest1.xlsx)).
The performance is similar (perhaps even a bit faster). The spreadsheet
content is correctly reflected in rawarr.

Finally, I created an Excel file with a number of formulas that result in
strings. Some refer to each other, some even through multilevel references.
Others use the INDIRECT function, and yet others concatenate two strings. See
attachment 4 (Excel2019 stringTest2.xlsx). The fastest option (regexp applied
to the complete cell array of strings) functions correctly, and the results
are properly shown in rawarr.

I tend to conclude that this is a proper performance fix.


(file #50017, file #50018, file #50019, file #50020)
    _______________________________________________________

Additional Item Attachment:

File name: Excel2019 stringTest1.xlsx     Size:73 KB
    <https://file.savannah.gnu.org/file/Excel2019 stringTest1.xlsx?file_id=50017>

File name: __OCT_xlsx2oct__.m             Size:9 KB
    <https://file.savannah.gnu.org/file/__OCT_xlsx2oct__.m?file_id=50018>

File name: LibreOffice6 stringTest1.xlsx  Size:129 KB
    <https://file.savannah.gnu.org/file/LibreOffice6 stringTest1.xlsx?file_id=50019>

File name: Excel2019 stringTest2.xlsx     Size:11 KB
    <https://file.savannah.gnu.org/file/Excel2019 stringTest2.xlsx?file_id=50020>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59277>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/