[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
lazy contiguous subrange indexing
From: |
John W. Eaton |
Subject: |
lazy contiguous subrange indexing |
Date: |
Fri, 16 Jan 2009 16:24:22 -0500 |
On 14-Jan-2009, Jaroslav Hajek wrote:
| Summary (1st patch):
| If a dense indexing expression can be reduced to a contiguous subrange
| (like x(2:end-1), A(:,:,3:10) etc.), a shared (i.e. cheap) copy is
| made and the actual extraction is postponed until (if ever) the value
| is written to. The obvious benefit is greater performance of various
| operations with indexed expressions, such as y = x(1:n-1) + x(2:n),
| which will avoid making two temporary copies. This is done by
| modifying Array<T> to allow it work with contiguous subranges of its
| (possibly shared) data. There is an obvious risk of increased memory
| consumption, because a big block of memory may be kept allocated while
| only part of it is actually used.
| This is resolved by checking prior to storing a value to a "permanent"
| location (same where checks for null matrices are done), and possibly
| economizing if an orphaned block is detected. The detection (see
| Array<T>::maybe_economize) is fairly simple and could be improved at
| the cost of performance of the operation. Since most indexing
| expressions do not live longer than their parent object I suppose this
| will seldom be an issue, but it is still easily possible to construct
| an example where memory is wastefully kept allocated.
| However, the issue is easy to avoid if you know what you do, so I
| think it's OK. Comments are welcome.
|
| To get an idea of the performance speed up, here's a simplistic test
| (set n to suitable value):
|
| n = 2e7;
| x = 1/n * [1:n]; y = x.^2;
| # compute 2nd-order differences
| tic; d2y = diff (y, 2); toc
| # compute integral
| tic; int = trapz (x, y); toc
|
| with a recent tip, I get:
|
| Elapsed time is 0.697262 seconds.
| Elapsed time is 0.960276 seconds.
|
| with the new patch, I got:
|
| Elapsed time is 0.213054 seconds.
| Elapsed time is 0.497459 seconds.
|
| which means speedups by 228% and 93%.
| P.S. I picked some benefitted functions because measuring directly the
| performance speed-up of the affected indexing expressions is
| meaningless.
|
|
| Summary (2nd patch):
| This is a long-postponed cleanup of Array<T>, DiagArray2<T> and
| PermMatrix, to allow the private members of Array<T> to become really
| private,
| and reimplement DiagArray and PermMatrix without the dangerous abuse
| of Array<T>::dimensions (that has caused me quite a pain with the
| previous patch).
| This patch should probably be applied in any case, I'm putting it here
| because right now it's created on top of the 1st patch and I've only
| ran make check on the result.
|
| comments? OK to push?
I think these changes are fine. I see that you already pushed them,
which is OK, but maybe in the future when you ask for comments you
could wait a few days? :-)
I noticed the following incomplete comment in ov.cc:
// FIXME: This is a good place for pre-storage hooks, but the functions should
// probably be named differently. These functions will be called
Can you please explain (in a comment somewhere) what is meant by
"storable_value"?
Also, it might be good to add some explanation in Array.h about how
slice_data is supposed to be used. I'd like to try to avoid confusion
for people who might want to make modifications to the Array class in
the future.
Thanks,
jwe