octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #60818] delaunayn - 2D code path vectorization


From: Nicholas Jankowski
Subject: [Octave-bug-tracker] [bug #60818] delaunayn - 2D code path vectorization doesn't match nD algorithm
Date: Sat, 31 Jul 2021 23:06:49 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36

Follow-up Comment #42, bug #60818 (project octave):

ok. ran a bunch of speed and memory checks for different dimensions (up to 7D)
and  numbers of points(10-10000). There were definite points at high dimension
where the LU approach was faster than the direct laplace expansion.  However,
the memory footprint is 3-6x higher. 

looking at the code to see where the memory increases come in, for one
example, 4D and 10k points - peak mem LU-919MB, laplace-233MB :

after running main __delaunayn__: working mem 66MB, peak 224MB
after building eqs:  working 185MB, peak 587MB
after running LU:  working 250MB, peak 919MB

So, both building eqs and running LU are significant memory users, doubling
and quadrupling the other method's usage. While my machine hit memory limits
just running the __delaunayn__ for 7D @ 10k pts, i also ran out of mem with
the LU algorithm at 6D, 10k pts. 

timing:
I ran __delaunayn__ through the tests without the simplex volume check to get
a baseline, then ran the old loop and the two vectorized approaches.  did
enough loops and tic/tocs to get a good average, then figured out average
time/operation and subtracted out __delaunayn__'s time to compare just the
checks. 

here was the original loop (i didn't take time getting the really low memory
numbers):


old loop                                
est max memory
dim     10      100     1000    10000
2d                              55
3d                      50      71
4d                      57      153
5d                      98      655
6d                      322     3500
7d                      1572    OOM

old loop                                
check time, ms
dim  10        100        1000      10000
2d   0.412164  0.462299   0.44162   0.9745
3d   1.341657  38.14379   435.1904  4755.529
4d   1.309238  128.04121  1950.719  22246.64
5d   1.371815  426.4956   8552.84   121692.96
6d   0.861909  1427.137   46292.5   667997
7d   0.646422  4765.347   221883.5  OOM


the LU algorithm:


LU 
est max memory
dim     10      100     1000    10000
2d                              64
3d                      66      252
4d              54      123     933
5d              63      528     5880
6d              140     3100    OOM
7d              453     13000   OOM

lu                              
check time, ms                          
dim  10      100     1000     10000
2d   0.4971  0.9715  6.6968   69.4045
3d   0.4556  2.3249  36.732   636.099
4d   0.5227  15.078  336.72   4516.34
5d   0.5603  96.882  2586.3   35837.96
6d   0.5480  494.23   18004   OOM
7d   0.5369  2471.3  205843   OOM

lu                              
check/loop                              
dim  10    100   1000   10000
2d   1.20  2.10  15.16  71.22
3d   0.33  0.06  0.084  0.133
4d   0.39  0.11  0.172  0.203
5d   0.40  0.22  0.302  0.294
6d   0.63  0.34  0.388  OOM
7d   0.83  0.51  0.927  OOM


and the laplace expansion:

laplace                         
est max memory                          
dim     10      100     1000    10000
2d                              55
3d                              78
4d                              233
5d                      85      1100
6d                      590     8400
7d              105     3800    OOM

laplace                         
check time, ms                          
dim  10      100     1000      10000
2d   0.2019  0.2276  0.10983   9.0344
3d   0.3230  0.3180  1.2734    33.169
4d   0.9225  1.1493  28.029    533.24
5d   3.9620  13.853  418.83    9204.96
6d   23.342  298.25  18801.6   278337
7d   164.25  7893.5  656063.5  OOM

laplace                         
check/loop                              
dim  10     100    1000   10000
2d   0.489  0.490  0.248  -0.28
3d   0.240  0.008  0.002  0.006
4d   0.704  0.008  0.014  0.023
5d   2.888  0.032  0.048  0.074
6d   27.08  0.208  0.406  0.413
7d   254.0  1.656  2.956  OOM


ok, so comparing the two:


memory_lu/memory_laplace
dim  10  100  1000  10000
2d                  1.16
3d                  3.23
4d                  4.00
5d            6.21  5.34
6d            5.25
7d            4.31  3.42

time_lu / time_laplace                          
dim  10     100   1000   10000
2d   2.462  4.26  60.97  7.68
3d   1.410  7.30  28.84  19.1
4d   0.566  13.1  12.01  8.46
5d   0.141  6.99  6.175  3.89
6d   0.023  1.65  0.957  inf
7d   0.003  0.31  0.313  OOM



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?60818>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]