[Octave-bug-tracker] [bug #59277] xls2oct and/or openxls behave unexpect


From: Dennis
Subject: [Octave-bug-tracker] [bug #59277] xls2oct and/or openxls behave unexpected
Date: Fri, 16 Oct 2020 08:23:52 -0400 (EDT)

Follow-up Comment #5, bug #59277 (project octave):

As an example of the behavior described in comment #4, I have attached the
same dummy Excel file, but with a 'payload' worksheet added.

In addition, I renamed the Java folders (see the pathnames mentioned in
comment #3), so that the UNO interface does not work. This avoids both the
error and the fact that xls2oct uses the UNO interface.

So, when I run the test script (comment #3) with the 'payload' sheet, the
profile output is as follows:

   #         Function Attr     Time (s)   Time (%)        Calls
---------------------------------------------------------------
   9         cell2mat             0.226      43.94          537
   8           regexp             0.111      21.66          285
  23 __OCT_xlsx2oct__             0.065      12.72            3
  13          cellfun             0.025       4.95         2439
   1          xls2oct             0.021       4.03            3
  49   parse_sp_range             0.009       1.80            3
  35       str2double             0.009       1.70           18
  32              cat             0.006       1.09          285
   6           ischar             0.004       0.77         1563
  43          col2num             0.004       0.73          742
  28            fread             0.003       0.64            6
  56           strrep             0.003       0.62           18
  14              all             0.003       0.62         1384
   3         prefix !             0.003       0.60         2487
  18             sort             0.003       0.60          255
  15             size             0.002       0.45          546
  26            fopen             0.002       0.31            6
  39          reshape             0.002       0.30          282
  10           nargin             0.001       0.28          559
   5          isempty             0.001       0.26          606


When that sheet is deleted, the profile is as follows:

   #         Function Attr     Time (s)   Time (%)        Calls
---------------------------------------------------------------
   8           regexp             0.116      35.95          111
   9         cell2mat             0.095      29.62          189
  23 __OCT_xlsx2oct__             0.034      10.68            3
  13          cellfun             0.017       5.23          873
  35       str2double             0.011       3.43           18
   1          xls2oct             0.007       2.32            3
  32              cat             0.005       1.62          111
   6           ischar             0.004       1.32         1563
  28            fread             0.004       1.15            6
  56           strrep             0.003       1.08           18
  43          col2num             0.003       1.02          742
  49   parse_sp_range             0.003       0.89            3
  26            fopen             0.002       0.59            6
  14              all             0.002       0.48          514
   3         prefix !             0.002       0.47          921
  62           strtok             0.001       0.45            3
  18             sort             0.001       0.37           81
  15             size             0.001       0.30          198
  50          deblank             0.001       0.29            3
  57            index             0.001       0.27            3


The time spent in cell2mat more than halves (from 0.226 s to 0.095 s)! I don't
understand what causes this, as xls2oct is called with a specific sheet.
Moreover, __OCT_xlsx2oct__ only extracts the specified sheet. What may cause
this interference? Note: this seems to be much worse when the cell contents are
plain-text URLs (as in the example).
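
One way to read the two tables side by side is to look at the cost per call
rather than the totals. The sketch below only divides the times by the call
counts copied from the profiles above; the profiler output is rounded, so the
results are approximate.

```octave
% Figures copied from the two profile tables above.
% Columns: time with 'payload' (s), calls with, time without (s), calls without.
names = {"cell2mat", "regexp", "cellfun"};
stats = [0.226  537  0.095  189;
         0.111  285  0.116  111;
         0.025 2439  0.017  873];

for i = 1:numel (names)
  ms_with    = 1e3 * stats(i,1) / stats(i,2);   % ms per call, 'payload' present
  ms_without = 1e3 * stats(i,3) / stats(i,4);   % ms per call, 'payload' deleted
  call_ratio = stats(i,2) / stats(i,4);         % how many times more calls
  printf ("%-9s %.3f ms/call vs %.3f ms/call, %.2fx the calls\n", ...
          names{i}, ms_with, ms_without, call_ratio);
end
```

For cell2mat this gives roughly 0.42 ms per call with the 'payload' sheet and
roughly 0.50 ms without it, while the number of calls goes from 189 to 537.
That might suggest that extra calls are being made when the other sheet is
present, rather than each call becoming slower.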

To summarize, in my attempt to speed things up I have encountered two issues:
- xls2oct using UNO when OCT is specified
- the time to extract data from one worksheet is influenced by the presence
(and contents) of another worksheet, especially when it contains URLs
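
Both points could be checked with something like the sketch below. It is not
the test script from comment #3. The sheet name "data" is a placeholder, and I
am assuming that the file pointer struct returned by xlsopen reports the
selected interface in its xtype field.

```octave
pkg load io                        % provides xlsopen, xls2oct and xlsclose

fname = "test.xlsx";               % the attached file (file #49993)
sheet = "data";                    % placeholder; use the sheet name from the test script

% Request the OCT interface explicitly and report which one was selected.
xls = xlsopen (fname, 0, "OCT");
printf ("interface in use: %s\n", xls.xtype);   % assumed field name

% Profile only the read of the requested worksheet.
profile clear;
profile on;
raw = xls2oct (xls, sheet);
profile off;

xls = xlsclose (xls);
profshow (profile ("info"), 10);   % ten most expensive functions
```

Running this once on the file with the 'payload' sheet and once on a copy
without it should show whether the difference comes from the read itself or
from opening the file.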

I hope these issues can be solved.

(file #49993)
    _______________________________________________________

Additional Item Attachment:

File name: test.xlsx                      Size:65 KB
    <https://file.savannah.gnu.org/file/test.xlsx?file_id=49993>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?59277>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
