Re: dataframe dereferencing

octave-maintainers

From:	Jaroslav Hajek
Subject:	Re: dataframe dereferencing
Date:	Fri, 3 Sep 2010 08:55:29 +0200

On Thu, Sep 2, 2010 at 9:31 PM, Judd Storrs <address@hidden> wrote:
> On Thu, Sep 2, 2010 at 3:04 PM, Jaroslav Hajek <address@hidden> wrote:
>>
>> while you have every right to naively expect this, understand that for
>> cell(x(1:3, 1:2)) the inner expression must result in some kind of
>> intermediary object (e.g. a sub-dataframe) which is then converted to
>> cell, while x.cell(1:3, 1:2) may be optimized so as to extract the
>> proper portion of data to cell directly. Similarly for matrix.
>
> I see your point--you think it's a performance issue, but I think it is
> incorrect to assume that subsetting a dataframe is necessarily
> inefficient. Really, that's a question of implementation not semantics. I
> don't think that linguistic novelty is a good approach to optimization. Two
> competing semantic models is a bad thing.

Competing? Oh no, these would be just happily co-existing :) Besides,
for a dataframe df there are actually two cell conversions, df.cell
and df.as.cell, and you need to distinguish between them.

> If performance is a problem,
> optimize later.

Like every dogmatic statement, this is not always true. Optimization
possibilities are always design-dependent to some extent.
Certain optimizations are simply impossible later if not borne in mind
from the very start.

> Personally, I think octave's internal function dispatch is
> always going to be faster than a cobbled-together m-file-based dispatch.

The dispatch is not the problem, the intermediate object is.
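
To make the distinction concrete, here is a minimal sketch in Python rather
than Octave. The Frame class and its sub, to_cell and cell methods are
hypothetical stand-ins for x(rows, cols), cell(x) and x.cell(rows, cols); they
are not the dataframe package's actual code. Both paths give the same result,
but the first has to build a temporary sub-frame.

```python
class Frame:
    """Toy column-oriented table; a hypothetical stand-in for the dataframe class."""

    copies = 0  # counts intermediate Frame objects built by indexing

    def __init__(self, columns):
        self.columns = [list(col) for col in columns]

    def sub(self, rows, cols):
        """Analogue of x(rows, cols): build an intermediate sub-frame."""
        Frame.copies += 1
        return Frame([[self.columns[c][r] for r in rows] for c in cols])

    def to_cell(self):
        """Analogue of cell(x): convert the whole frame to nested lists, row by row."""
        nrows = len(self.columns[0]) if self.columns else 0
        return [[col[r] for col in self.columns] for r in range(nrows)]

    def cell(self, rows, cols):
        """Analogue of x.cell(rows, cols): extract straight to cell form."""
        return [[self.columns[c][r] for c in cols] for r in rows]


x = Frame([[1, 2, 3, 4], ["a", "b", "c", "d"], [0.1, 0.2, 0.3, 0.4]])
rows, cols = range(3), range(2)

via_subframe = x.sub(rows, cols).to_cell()  # two steps, one temporary Frame
direct = x.cell(rows, cols)                 # one step, no temporary Frame
```

Whether the temporary object is expensive depends on the implementation, which
is exactly the point under discussion; the sketch only shows where it arises.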

> A
> different optimization would be to make dataframe perform lazy
> sub-referencing--e.g. a subframe is a view of the original frame (which
> could also have memory advantages).
>>

Lazy indexing is cool, but there are a number of problems implied...
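
One problem of this kind can be sketched as follows. It is an illustrative
assumption and not necessarily one of the problems meant above. The View class
is hypothetical and written in Python, not Octave. A view that copies nothing
keeps the whole parent alive and sees later changes to it, so in a language
with value semantics it would need some copy-on-write scheme on top.

```python
class View:
    """Hypothetical lazy sub-reference: stores the parent and the index, copies nothing."""

    def __init__(self, parent, rows):
        self.parent = parent    # the whole parent stays referenced
        self.rows = list(rows)

    def materialize(self):
        """Read the selected elements from the parent at the time of the call."""
        return [self.parent[r] for r in self.rows]


data = [10, 20, 30, 40]
v = View(data, range(2))

before = v.materialize()  # reads the parent as it is now
data[0] = 99              # the parent is modified after the view was taken
after = v.materialize()   # the view observes the modification
```

Here before and after differ, whereas an eagerly copied sub-frame would be
unaffected by the later assignment.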

>> However, I see no reason why dataframe couldn't support conversion to
>> cell through cell (dataframe) as well.
>
> Well, I don't think we want to go the perl route if we can avoid it...
>>

Huh?

>> Before you overload {} or suggest doing it, make sure you understand
>> the associated cs-list & numel issues.
>
> You're going to have to point me somewhere on this one. I'm proposing it
> anyway because it's semantically correct.
>

Expressions like A{I} and A(I).B may generate a cs-list (comma-separated
list). This is especially important in assignment, where the cs-list
length needs to be evaluated *prior* to the right-hand side (and hence
prior to the subsasgn call).

regards

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
