Re: FYI: nth

octave-maintainers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: FYI: nth_element

From:	Jaroslav Hajek
Subject:	Re: FYI: nth_element
Date:	Thu, 15 Oct 2009 14:31:22 +0200

On Thu, Oct 15, 2009 at 2:22 PM, Shai Ayal <address@hidden> wrote:
> On Thu, Oct 15, 2009 at 12:36 PM, Jaroslav Hajek <address@hidden> wrote:
>> On Thu, Oct 15, 2009 at 9:56 AM, Shai Ayal <address@hidden> wrote:
>>> On Thu, Oct 15, 2009 at 9:15 AM, Jaroslav Hajek <address@hidden> wrote:
>>>> hi all,
>>>>
>>>> Octave has just got a new function: nth_element:
>>>> http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1
>>>>
>>>> the docstring:
>>>>
>>>>  -- Built-in Function:  nth_element (X, N)
>>>>  -- Built-in Function:  nth_element (X, N, DIM)
>>>>     Select the n-th smallest element of a vector, using the ordering
>>>>     defined by `sort'.  In other words, the result is equivalent to
>>>>     `sort(X)(N)'.  N can also be a contiguous range, either ascending
>>>>     `l:u' or descending `u:-1:l', in which case a range of elements is
>>>>     returned.  If X is an array, `nth_element' operates along the
>>>>     dimension defined by DIM, or the first non-singleton dimension if
>>>>     DIM is not given.
>>>>
>>>>     nth_element encapsulates the C++ STL algorithms nth_element and
>>>>     partial_sort.  On average, the complexity of the operation is
>>>>     O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'.  This
>>>>     function is intended for cases where the ratio K/M is small;
>>>>     otherwise, it may be better to use `sort'.
>>>>
>>>>     See also: sort, min, max
>>>>
>>>> In short, it allows extracting a small portion of sort(X) without
>>>> actually doing the full sort.
>>>> This is sometimes useful for statistics when computing quantiles and
>>>> the like... statisticians can surely tell better.
>>>>
>>>> Maybe it can be used to boost some existing functions. As an example,
>>>> it can be readily used to speed up the "median" function:
>>>> http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e
>>>>
>>> It can also help hist quite a lot.
>>>
>>
>> I don't see how. Can you clarify?
>
> You are right -- I was confused. hist needs the number of elements
> smaller than a given value (and uses sort to find out), while
> nth_element gives the value of the nth element.
> Writing hist in c++ is not hard and would really help since it uses
> (M+K)log(M+K) sort in order to avoid a M*K loop just because looping
> is slow in the scripting language. One of these days I'll get down to
> doing that ...
>
> Shai
>

Why not just use a binary search, via `lookup'? That would be
O(M*log(K) + K) (M the number of data and K the number of bins).
Look what histc does... maybe hist can even simply call histc.

hth

-- 
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

[Prev in Thread]

Current Thread

[Next in Thread]

FYI: nth_element, Jaroslav Hajek, 2009/10/15
- Re: FYI: nth_element, Shai Ayal, 2009/10/15
  - Re: FYI: nth_element, Jaroslav Hajek, 2009/10/15
    - Re: FYI: nth_element, Shai Ayal, 2009/10/15
    - Re: FYI: nth_element, Jaroslav Hajek <=
    - Re: FYI: nth_element, Shai Ayal, 2009/10/15
- Re: FYI: nth_element, Jordi Gutiérrez Hermoso, 2009/10/17
  - Re: FYI: nth_element, Jaroslav Hajek, 2009/10/17

Prev by Date: Re: FYI: nth_element
Next by Date: Re: FYI: nth_element
Previous by thread: Re: FYI: nth_element
Next by thread: Re: FYI: nth_element
Index(es):
- Date
- Thread