octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #60364] Takes too much memory to call unique(p


From: Michael Leitner
Subject: [Octave-bug-tracker] [bug #60364] Takes too much memory to call unique(perms(...)), so here's a new function uniqueperms
Date: Sat, 10 Apr 2021 03:20:44 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0

Follow-up Comment #1, bug #60364 (project octave):

Good idea, but, first, I doubt that the octave maintainers will be willing to
introduce a new function into the main namespace. What seems more likely is if
you propose to add a flag "unique" to perms(), which would then have the
behaviour you propose -- at the moment, neither octave nor Matlab for that
matter recognize flags to perms(), so that should be no problem. And I can
even imagine that, if octave should introduce that, Matlab would follow,
because: in the strict mathematical sense, a permutation is not a reordering
of a specific (multi-)set, but a specific mapping of positions -- that is, it
is not yet applied. Thus, an example of a permutation of order 3 would be
"swap the second and third element and leave the first fixed", written as
(1)(32). Only if you write perms(1:n) for a given n you really get
permutations in the mathematical sense.  

What octave actually computes are multiset permutations applied to a specific
multiset, but without dropping equivalent multiset permutations. I agree that
offhand I do not see a situation where I would want those. What you propose
are multiset permutations in the strict sense. Note that this is also what
Wolfram alpha (and thus presumably Mathematica compute): if you enter
Permutations[{1, 2, 2}], you get a three-, not six-element list of
permutations. So I agree that your contribution is helpful.

As to the performance: I think that you can get close to the performance of
perms (that means, in practical time you can generate permutations that fill
memory) if you do the following: first, compute only the multiset
permutations, not the application to a specific multiset. Thus, compute
indices, and you have to use here only uint8 (saving memory), because you will
never have more than 256 distinct elements in the multiset (because of
memory). Then, do it iteratively: say that n(i) is the frequency that element
i (sorted) appears in the multiset. You first compute the multiset
permutations of the sub-multiset consisting only of this first element. This
is of course ones(n(1),1). Then you iteratively compute the multiset
permutations of the sub-multiset consisting of the first i unique elements
from those consisting of the first i-1. You can do so by index arithmetics and
cumsum(nchoosek(1:1+sum(n(1:i-1)),n(i)),2), which gives you the positions
where you expand the input multiset permutation and put entry i. And of course
you do that by a single assignment on the matrix of permutation vectors. This
should need only a single for loop (over the unique entries in your input
multiset), and in addition the for loop in nchoosek(). I think that this
should be performant. And at the end of course you use your multiset
permutation as input into the unique values of your input multiset.



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?60364>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]