octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: performance problem with unique on sparse arrays


From: Jaroslav Hajek
Subject: Re: performance problem with unique on sparse arrays
Date: Fri, 19 Mar 2010 13:22:40 +0100

On Mon, Mar 1, 2010 at 9:43 PM, David Bateman <address@hidden> wrote:
> Jaroslav Hajek wrote:
>>
>> It would be nice to let the Sparse indexing benefit from similar
>> improvements as the Array indexing, and also clean up the interface in
>> a similar manner. Since sparse indexing is only 1D or 2D, reductions
>> are not needed here, but the innermost loop specializations would
>> likely make the code both cleaner and more efficient.
>> I'd like to do some work here after 3.4, but I would appreciate
>> assistance, because the coding will probably be more difficult (except
>> for the lack of n-d case, which will make it easier).
>>
>
> I understand the sparse indexing pretty well and would be happy to help out.
> The existing index, maybe_delete_elements and assign methods of the
> Sparse<T> class were written by taking the Array<T> methods from version
> 2.1.63 (at least I think it was about then) and then just converting them
> ad-hoc to Sparse versions that did exactly the same thing.  I suppose the
> same process needs to be done with a recent version of your Array<T>
> indexing changes and will probably be easier as these functions are now much
> simpler. A nice thing to add if we do that  is to make
>
> sm = spalloc (r, c, n);
> for i = 1 : r
>  for j = 1 : c
>    y = foo (i, j);
>    if (y)
>      sm(i, j) = y;
>    endif
>  endfor
> endfor
>
> not reallocate sm on every call to the Sparse<T>::assign method as it
> currently does. Fixing the indexing such that something like
>
> sm = sprandn (1e6,1e6,1e-6); sm(sm<1)= 0;
>
> does cause a memory overflow with sparse logical indexing would be really
> nice as well.
>
> Cheers
> David
>
>


Hi David,

I started modernizing the Sparse indexing interface. In particular the
interfaces are updated so that idx_vector is passed by a const
reference and bool flags are used rather than int. I removed the
obsolete calls like idx_vector::freeze and improved the code using new
features from octave_sort and idx_vector, and in the end I ended with
a rewrite, along with some induced changes.

The result is here:
http://hg.savannah.gnu.org/hgweb/octave/rev/0677c5d80b77

All tests pass for me. I tried this benchmark:

1;
function do_test (s)
  disp ("Extracting single element 1000 times");
  n = numel (s);
  jj = ceil (n*rand (1, 1000));
  tic;
  for j = jj
    s(j);
  endfor
  toc
  disp ("Extracting contiguous range 100 times");
  jj = sort (ceil (n*rand (2, 100)), 1);
  tic;
  for j = jj
    s(j(1):j(2));
  endfor
  toc
  disp ("Extracting arbitrary slice 2 times");
  jj = ceil (n*rand (round (n / 10), 2));
  tic;
  for j = jj
    s(j);
  endfor
  toc
endfunction

m = 1000;
density = 0.1;

disp ("Testing sparse column vector");
do_test (sprand (m*m, 1, density));

disp ("Testing sparse row vector");
do_test (sprand (1, m*m, density));

disp ("Testing sparse matrix");
do_test (sprand (m, m, density));

with the following result (Core 2 Duo @ 2.83GHz, g++ -03 -march=native):

address@hidden:~/devel/octave/main> ./run-octave -q ttsi.m
Testing sparse column vector
Extracting single element 1000 times
Elapsed time is 0.00372815 seconds.
Extracting contiguous range 100 times
Elapsed time is 0.00738716 seconds.
Extracting arbitrary slice 2 times
Elapsed time is 0.0264862 seconds.
Testing sparse row vector
Extracting single element 1000 times
Elapsed time is 0.00375104 seconds.
Extracting contiguous range 100 times
Elapsed time is 0.0875449 seconds.
Extracting arbitrary slice 2 times
Elapsed time is 0.019655 seconds.
Testing sparse matrix
Extracting single element 1000 times
Elapsed time is 0.00839305 seconds.
Extracting contiguous range 100 times
Elapsed time is 0.145814 seconds.
Extracting arbitrary slice 2 times
Elapsed time is 0.0266171 seconds.

with recent tip, I get

address@hidden:~/devel/octave/main> octave -q ttsi.m
Testing sparse column vector
Extracting single element 1000 times
Elapsed time is 0.258644 seconds.
Extracting contiguous range 100 times
Elapsed time is 0.0249069 seconds.
Extracting arbitrary slice 2 times
Elapsed time is 51.6195 seconds.
Testing sparse row vector
Extracting single element 1000 times
Elapsed time is 0.00383902 seconds.
Extracting contiguous range 100 times
Elapsed time is 0.116042 seconds.
Extracting arbitrary slice 2 times
Elapsed time is 0.00955701 seconds.
Testing sparse matrix
Extracting single element 1000 times
Elapsed time is 0.00820112 seconds.
Extracting contiguous range 100 times
Elapsed time is 6.28953 seconds.
Extracting arbitrary slice 2 times
Elapsed time is 0.0401349 seconds.

so it doesn't seem I slowed something down significantly except
perhaps the sparse row vector.
In that case, the scalar and cont. range is specialized, otherwise the
array is converted to full, indexed, and then sparsified.
I thought this was OK because a sparse row vector takes O(nc) space
anyway, so it's an inefficient way to store a sparse vector. Sparse
column vector is the way to go here.

You're welcome to review the changes and comment on them.

I'll address the 2D indexing next, then eventually indexed assignment.

regards

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz



reply via email to

[Prev in Thread] Current Thread [Next in Thread]