[Octave-bug-tracker] [bug #59277] [octave-forge](io) xls2oct is slow whe
From: Dennis
Subject: [Octave-bug-tracker] [bug #59277] [octave-forge](io) xls2oct is slow when a spreadsheet contains many text cells
Date: Sun, 25 Oct 2020 07:53:58 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36
Follow-up Comment #23, bug #59277 (project octave):
Philip, regarding comment #20:
> Would you please look
> 1. if it still works
> 2. if you still have the expected speed gain
> ?
I tested your file and it still works. It also still shows the performance
improvement that I had before.
In addition, I created another test Excel sheet and a small script (attached).
With the __OCT_xlsx2oct__.m from v2.6.2 the output is:
Elapsed time is 161.678 seconds.
   #  Function          Attr   Time (s)   Time (%)    Calls
------------------------------------------------------------
  32  cell2mat                   83.546      52.37   200060
  23  regexp                     34.835      21.84   100060
  13  __OCT_xlsx2oct__           18.686      11.71       10
  24  cellfun                     7.233       4.53   910340
  44  str2double                  2.667       1.67       30
By far most of the time is spent on cell2mat.
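For reference, a timing line and profile listing like the one above can be produced roughly as follows. This is only a minimal sketch, not the attached Test2.m: the synthetic workload, the variable names and the loop count of 10 (guessed from the 10 calls to __OCT_xlsx2oct__ in the listing) are all assumptions. With the io package loaded, the workload would be the actual spreadsheet read instead.

```octave
% Hypothetical stand-in workload (an assumption): a regexp over synthetic
% cell XML. In a real test this would be the spreadsheet read itself.
xml = repmat ('<c r="A1" t="s"><v>1</v></c>', 1, 1000);
workload = @() regexp (xml, '<v>(\d+)</v>', "tokens");
nloops = 10;                 % assumed loop count

profile clear;               % discard data from earlier profiling runs
profile on;
t0 = tic ();
for k = 1:nloops
  workload ();
endfor
elapsed = toc (t0);          % wall-clock time of the whole loop
profile off;

printf ("Elapsed time is %g seconds.\n", elapsed);
info = profile ("info");     % struct holding the collected profile data
profshow (info, 5);          % list the five most expensive functions
```

Note that the wall-clock time is measured while the profiler is running, so it includes the profiler's own overhead.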
With the update you posted in comment #20, the output is:
Elapsed time is 53.296 seconds.
   #  Function          Attr   Time (s)   Time (%)    Calls
------------------------------------------------------------
  23  regexp                     30.263      57.15       70
  33  cell2mat                    8.937      16.88    20080
  13  __OCT_xlsx2oct__            3.050       5.76       10
  47  str2double                  2.482       4.69       30
  26  cat                         2.148       4.06    20090
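That is a major improvement: the total run time drops by roughly a factor of
three, and the time spent in cell2mat falls from about 84 s to about 9 s. I
think we can consider this sufficiently fixed, as most of the remaining time
is indeed spent on regexp.
I also took another look at which regexp call is responsible, and it is NOT
L.99. I changed the test script to loop only once and measured the time spent
on each regexp call I could find. Here are the results: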
L.99: Elapsed time is 0.353959 seconds.
L.142: Elapsed time is 0.201676 seconds.
L.145: Elapsed time is 0.665056 seconds.
L.175: N/A
L.197: Elapsed time is 0.694173 seconds.
L.200: Elapsed time is 0.726311 seconds.
L.204: Elapsed time is 0.601505 seconds.
L.205: Elapsed time is 0.548254 seconds.
Clearly L.99 is not the biggest time consumer. Next, I profiled L.197-200 and
L.204-205. In both cases, the profiler shows that almost one second is spent
on regexp. So in these lines, regexp itself is responsible for the time
consumption, not any other function.
In conclusion, if you want to optimize the speed any further, these seem to be
the lines of code where most effort should go.
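The per-line measurements above come down to wrapping each regexp call in tic/toc. A minimal sketch of that approach follows; the synthetic XML string and the two patterns labelled A and B are hypothetical and are not the patterns used on the lines listed above.

```octave
% Synthetic stand-in for worksheet XML (an assumption, not real sheet data).
ncells = 20000;
xml = repmat ('<c r="A1" t="s"><v>12</v></c>', 1, ncells);

% Hypothetical patterns; NOT the ones in __OCT_xlsx2oct__.m.
pats = struct ("label",   {"A", "B"}, ...
               "pattern", {'<v>(.*?)</v>', '<c r="(\w+)"'});

elapsed = zeros (1, numel (pats));
nhits   = zeros (1, numel (pats));
for k = 1:numel (pats)
  id = tic ();                                   % start a timer for this call
  tok = regexp (xml, pats(k).pattern, "tokens");
  elapsed(k) = toc (id);                         % seconds spent in this call
  nhits(k) = numel (tok);                        % number of matches found
  printf ("%s: Elapsed time is %f seconds.\n", pats(k).label, elapsed(k));
endfor
```

Timing each call separately makes it easy to rank the calls, but a single run is noisy, so repeating the measurement a few times gives a more reliable ranking.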
Regarding your question about credits: that is fine, thanks :-).
Cheers,
Dennis
(file #50093, file #50094)
_______________________________________________________
Additional Item Attachment:
File name: Excel2019 stringTest1.xlsx Size:82 KB
<https://file.savannah.gnu.org/file/Excel2019 stringTest1.xlsx?file_id=50093>
File name: Test2.m Size:0 KB
<https://file.savannah.gnu.org/file/Test2.m?file_id=50094>
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?59277>