[Octave-bug-tracker] [bug #59277] xls2oct and/or openxls behave unexpect
From: Dennis
Subject: [Octave-bug-tracker] [bug #59277] xls2oct and/or openxls behave unexpected
Date: Fri, 16 Oct 2020 10:53:07 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36
Follow-up Comment #6, bug #59277 (project octave):
As I have been really troubled by speed issues, I dug into the problem a bit
further. I now understand why additional sheets can have an impact on
performance. This is due to the way xlsx files are constructed: all plain
strings of all worksheets are stored in one XML file. When this gets big, many
cell2mat calls are made.
I have tested a very simple solution. In '__OCT_xlsx2oct__' I have replaced a
single line of code, namely line 146 by the new line 147 (see attached). This
line uses two cell2mat calls, and is part of a for loop. However, as far as I
can tell, the list of strings that is being processed here ALWAYS contains
single cells. That is why we could simply use {1}{1} instead of
cell2mat(cell2mat()). This is MUCH faster.
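To illustrate the equivalence claimed above, here is a minimal sketch in Octave. The variable `entry` is a hypothetical stand-in for one element of the string list (a 1x1 cell wrapping a 1x1 cell wrapping a char row vector); it is not taken from '__OCT_xlsx2oct__' itself, and the timings will differ per machine.

```octave
% Hypothetical stand-in for one entry of the shared-strings list:
% a 1x1 cell wrapping a 1x1 cell wrapping a char row vector.
entry = {{'Total revenue'}};

a = cell2mat (cell2mat (entry));   % original approach: two function calls
b = entry{1}{1};                   % proposed approach: plain cell indexing

% Both approaches should yield the same char array for a single nested cell.
assert (ischar (a) && ischar (b));
assert (isequal (a, b));

% Rough timing over many such entries (numbers vary by machine).
n = 2000;
entries = repmat ({entry}, n, 1);  % n x 1 cell, each element is {{'...'}}

tic;
for k = 1:n
  s = cell2mat (cell2mat (entries{k}));
end
t_cell2mat = toc;

tic;
for k = 1:n
  s = entries{k}{1}{1};
end
t_index = toc;

printf ("cell2mat: %.4f s, indexing: %.4f s\n", t_cell2mat, t_index);
```

The indexing form avoids the argument checking and concatenation that cell2mat performs on every call, which is where the extra cellfun, cat and reshape calls in the profiles come from.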
Using the attached test script and Excel file, with line 146 active and line
147 commented out (i.e. the original code), the result is:
Elapsed time is 0.393326 seconds.
   #          Function Attr   Time (s)   Time (%)   Calls
---------------------------------------------------------
  10          cell2mat           0.156      41.26     537
   9            regexp           0.088      23.23     285
  24  __OCT_xlsx2oct__           0.050      13.20       3
  14           cellfun           0.017       4.38    2439
   2           xls2oct           0.015       3.92       3
  36        str2double           0.009       2.35      18
  50    parse_sp_range           0.008       2.15       3
  29             fread           0.005       1.34       6
  33               cat           0.004       1.15     285
   4          prefix !           0.002       0.62    2487
  27             fopen           0.002       0.61       6
  19              sort           0.002       0.60     255
  44           col2num           0.002       0.55     742
  15               all           0.002       0.50    1384
   7            ischar           0.002       0.44    1563
  57            strrep           0.001       0.39      18
  16              size           0.001       0.38     546
  40           reshape           0.001       0.29     282
  11            nargin           0.001       0.26     559
  63            strtok           0.001       0.22       3
Commenting line 146 and uncommenting line 147 in '__OCT_xlsx2oct__' yields:
Elapsed time is 0.174529 seconds.
   #          Function Attr   Time (s)   Time (%)   Calls
---------------------------------------------------------
   9            regexp           0.079      47.75     285
  24  __OCT_xlsx2oct__           0.028      17.02       3
   2           xls2oct           0.013       7.55       3
  36        str2double           0.008       5.07      18
  50    parse_sp_range           0.006       3.88       3
  10          cell2mat           0.006       3.77      21
  14           cellfun           0.005       3.12     117
  29             fread           0.004       2.23       6
  33               cat           0.003       2.02      21
  27             fopen           0.002       1.24       6
  44           col2num           0.002       1.05     742
   7            ischar           0.001       0.86    1563
  57            strrep           0.001       0.77      18
  63            strtok           0.001       0.51       3
  41             clear           0.000       0.28       3
  51           deblank           0.000       0.26       3
  58             index           0.000       0.24       3
  31            fclose           0.000       0.24       6
   8        chknmrange           0.000       0.23       3
  35           strncmp           0.000       0.18      12
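For anyone who wants to collect comparable numbers, tables like the two above can be produced with Octave's built-in profiler. The sketch below uses a hypothetical workload (a loop over dummy nested cells) in place of the attached test script, whose contents are not reproduced here; in practice the loop would be replaced by the xls2oct calls being measured.

```octave
% Hypothetical workload: 500 dummy entries shaped like {{'abc'}}.
entries = repmat ({{{'abc'}}}, 500, 1);

profile clear;   % discard any earlier profiling data
profile on;      % start collecting
tic;
for k = 1:numel (entries)
  s = cell2mat (cell2mat (entries{k}));
end
toc;             % prints "Elapsed time is ... seconds."
profile off;     % stop collecting

info = profile ("info");   % struct with the collected data
profshow (info, 20);       % flat profile, top 20 functions by time
```

Note that the profiler only records function calls, so the plain-indexing variant shows up as a drop in the cell2mat, cellfun, cat and reshape call counts rather than as a line of its own.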
In my real-life script, which uses a bigger Excel sheet, the differences are
much more pronounced. As an example, the original '__OCT_xlsx2oct__' code
yields:
Elapsed time is 10.1298 seconds.
   #                Function Attr   Time (s)   Time (%)   Calls
---------------------------------------------------------------
  81                cell2mat           5.105      51.74   20903
 101        __OCT_xlsx2oct__           1.249      12.66      15
  80                  regexp           1.148      11.64   10657
  29                 cellfun           0.434       4.39   94317
  73                  system           0.402       4.07       1
While the newly proposed code yields:
Elapsed time is 3.81288 seconds.
   #                Function Attr   Time (s)   Time (%)   Calls
---------------------------------------------------------------
  80                  regexp           1.252      33.20   10657
 101        __OCT_xlsx2oct__           0.775      20.55      15
  73                  system           0.404      10.71       1
 143  @Recipe/calc_magistral           0.169       4.48       1
  28                    load           0.145       3.84      16
That really makes a difference: the duration went from about 10.1 s to about
3.8 s.
@philipnienhuis, can you please check that this single updated line can be
implemented (i.e. the cell with strings indeed always contains a single nested
cell)? If so, could you please release a new version of io as soon as
possible?
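Until it is confirmed that every entry is a single nested cell, a guarded variant could keep the fast path while preserving the original behaviour for any other shape. This is only a sketch: `unwrap_entry` is a hypothetical helper name, not a function in the io package, and the shape check is an assumption about what the entries look like.

```octave
1;  % mark this file as a script so the function below can be defined inline

% Hypothetical helper (not part of the io package): unwrap one entry of the
% string list. Uses plain indexing when the entry is a single nested cell,
% and falls back to the original cell2mat chain for any other shape.
function s = unwrap_entry (entry)
  if (iscell (entry) && numel (entry) == 1 ...
      && iscell (entry{1}) && numel (entry{1}) == 1)
    s = entry{1}{1};                   % fast path: plain indexing
  else
    s = cell2mat (cell2mat (entry));   % original behaviour
  end
end

% Single nested cell: the fast path returns the wrapped string.
result = unwrap_entry ({{'abc'}});
assert (isequal (result, 'abc'));
```

The extra checks cost a few cheap built-in calls per entry, which should still be far less than two cell2mat calls, but that would need to be measured against the real data.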
NB: this solves the issue with additional sheets having an effect on speed,
but it doesn't solve the issue that xls2oct sometimes uses UNO even though OCT
is specified.
(file #49996, file #49997, file #49998)
_______________________________________________________
Additional Item Attachment:
File name: __OCT_xlsx2oct__.m Size:9 KB
<https://file.savannah.gnu.org/file/__OCT_xlsx2oct__.m?file_id=49996>
File name: BLAAAAA.m Size:0 KB
<https://file.savannah.gnu.org/file/BLAAAAA.m?file_id=49997>
File name: test.xlsx Size:65 KB
<https://file.savannah.gnu.org/file/test.xlsx?file_id=49998>
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?59277>