octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #63791] movfun - high intermediate memory use


From: Rik
Subject: [Octave-bug-tracker] [bug #63791] movfun - high intermediate memory use for moderate data input resulting in out of memory errors
Date: Sun, 12 Feb 2023 18:19:47 -0500 (EST)

Update of bug #63791 (project octave):

                  Status:               Confirmed => Patch Submitted        

    _______________________________________________________

Follow-up Comment #22:

As Nicholas noted, this is a design choice because Octave does not have a JIT
compiler for loops.  Because loops are slow, they are avoided at all costs. 
In this case, we trade memory for speed by creating an array with all of the
necessary data for the window size requested, and then use a vectorized
function such as mean() to reduce that back to the size of the data itself. 
If N is the number of elements along the requested dimension (size (x, DIM))
and WLEN is the window length then the amount of memory required is


WLEN * (N - WLEN) * sizeof (DATATYPE)


For the example here where WLEN=513, N=1e6, class=double (8B), the required
memory is 4.1GB.  That's a big chunk of data to malloc().

One approach would be to rewrite the movfun.m function in C++ so that loops
are not an issue.  This is, I believe, what Matlab has done as a function like
movmean is a built-in function for them rather than an m-file.  Another option
is to reduce the size of the memory allocations by working in chunks.  Here is
a patch for movfun.m which does that


diff --git a/scripts/signal/movfun.m b/scripts/signal/movfun.m
--- a/scripts/signal/movfun.m
+++ b/scripts/signal/movfun.m
@@ -296,7 +296,23 @@ function y = movfun_oncol (fcn, x, wlen,
   y = NA (N, odim);
 
   ## Process center part
-  y(C,:) = fcn (x(slcidx));
+  try
+    y(C,:) = fcn (x(slcidx));
+  catch err
+    ## Operation failed, likely because of out-of-memory error for
"x(slcidx)".
+    if (! strcmp (err.identifier, "Octave:bad-alloc"))
+      rethrow (err);
+    endif
+
+    ## Try divide and conquer approach with smaller slices of data.
+    ## For loops are slow, so don't try too hard with this approach.
+    Nslices = 8;  # configurable
+    idx1 = fix (linspace (1, numel (C), Nslices));
+    idx2 = fix (linspace (1, columns (slcidx), Nslices));
+    for i = 1 : Nslices-1
+      y(C(idx1(i):idx1(i+1)),:) = fcn (x(slcidx(:, idx2(i):idx2(i+1))));
+    endfor
+  end_try_catch
 
   ## Process boundaries
   if (! isempty (Cpre))


As you can see, if the memory allocation fails, it attempts to redo the
allocation in smaller chunks and assemble everything with a for loop. 
Interestingly, for the example in this bug report my runtime was ~8 seconds
with the for loop and ~13 seconds with the original code.  I suppose if one
were getting really fancy one could use the output from memory() to compute a
reasonable chunk size given available free memory.

I'm attaching the patch as well as the modified movfun.m if anyone wants to
test this.


(file #54352, file #54353)

    _______________________________________________________

Additional Item Attachment:

File name: patch.63791                    Size:0 KB
    <https://file.savannah.gnu.org/file/patch.63791?file_id=54352>

File name: movfun.m                       Size:22 KB
    <https://file.savannah.gnu.org/file/movfun.m?file_id=54353>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?63791>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]