Re: 3D versus 2D Indexing and the Speed Thereof
From: John W. Eaton
Subject: Re: 3D versus 2D Indexing and the Speed Thereof
Date: Mon, 9 Apr 2007 20:01:18 -0400
On 6-Apr-2007, Luis F. Ortiz wrote:
| 1) One of the methods patched is assign2(). It has the following
| signature:
|
| template <class LT, class RT>
| int
| assign2 (Array<LT>& lhs, const Array<RT>& rhs, const LT& rfv)
|
| This seems to me to be an attempt to support type conversions during
| the assignment. But the code I wrote only works for the case where RT
| and LT are the same type. What is the right way to handle this? Can it
| be done at runtime/compiletime? Is this ever instantiated with
| LT != RT?
I thought about this a bit more and came up with the following as a
possible solution.
Your copy strips function is:
  template <class T>
  void
  Array<T>::copy_strips (const Array<T>& source,
                         octave_idx_type dest_offset,
                         octave_idx_type source_offset,
                         octave_idx_type element_count,
                         octave_idx_type block_count,
                         octave_idx_type source_stride,
                         octave_idx_type dest_stride)
  {
    T *raw_source, *raw_dest;

    // First do one element to force the copy-on-write
    elem(dest_offset) = source.elem (source_offset);

    raw_source = &(source.rep->data[source_offset]);
    raw_dest = &(rep->data[dest_offset] );

    for (octave_idx_type i = 0; i < block_count; i++)
      {
        memcpy (raw_dest, raw_source, sizeof(T)*element_count);
        raw_source += source_stride;
        raw_dest += dest_stride;
      }
  }
I think it could be done with something like this (pushing the
actual work down to the Array<T>::ArrayRep level):
  template <class T> class Array
  {
    ...

    class ArrayRep
    {
      ...

      // Generic mixed-type copy-strips function:

      template <class U>
      void
      copy_strips (const Array<U>& source,
                   octave_idx_type dest_offset,
                   octave_idx_type source_offset,
                   octave_idx_type element_count,
                   octave_idx_type block_count,
                   octave_idx_type source_stride,
                   octave_idx_type dest_stride)
      {
        make_unique ();

        const U *source_data = source.data ();
        const U *raw_source = &source_data[source_offset];
        T *raw_dest = &data[dest_offset];

        for (octave_idx_type i = 0; i < block_count; i++)
          {
            // Copy one strip element by element, converting U to T.
            for (octave_idx_type j = 0; j < element_count; j++)
              raw_dest[j] = raw_source[j];

            raw_source += source_stride;
            raw_dest += dest_stride;
          }
      }

      // Non-template overload of the copy-strips function for the
      // case of LHS type == RHS type; overload resolution prefers it
      // over the template when the types match (only really necessary
      // if it actually makes things faster to use memcpy):

      void
      copy_strips (const Array<T>& source,
                   octave_idx_type dest_offset,
                   octave_idx_type source_offset,
                   octave_idx_type element_count,
                   octave_idx_type block_count,
                   octave_idx_type source_stride,
                   octave_idx_type dest_stride)
      {
        make_unique ();

        const T *source_data = source.data ();
        const T *raw_source = &source_data[source_offset];
        T *raw_dest = &data[dest_offset];

        for (octave_idx_type i = 0; i < block_count; i++)
          {
            memcpy (raw_dest, raw_source, sizeof(T)*element_count);
            raw_source += source_stride;
            raw_dest += dest_stride;
          }
      }

      ...
    }; /* class ArrayRep */

    ...

    template <class U>
    void
    copy_strips (const Array<U>& source,
                 octave_idx_type dest_offset,
                 octave_idx_type source_offset,
                 octave_idx_type element_count,
                 octave_idx_type block_count,
                 octave_idx_type source_stride,
                 octave_idx_type dest_stride)
    {
      rep->copy_strips (source, dest_offset, source_offset,
                        element_count, block_count, source_stride,
                        dest_stride);
    }

    ...
  }; /* class Array */
I haven't actually tried this code yet with the Array class, but I
think it should work. Here is a very simple and complete example of
the same kind of thing that appears to work correctly for me with g++
3.4 and 4.1:
  #include <iostream>

  template <class T> struct foo
  {
    struct foo_rep
    {
      // Chosen when the argument type differs from T.
      template <class U>
      void doit (U) { std::cerr << "mixed" << std::endl; }

      // Non-template overload, preferred when the argument is exactly T.
      void doit (T) { std::cerr << "same" << std::endl; }
    };

    foo (void) : rep (new foo_rep ()) { }

    template <class U>
    void doit (U x) { rep->doit (x); }

    foo_rep *rep;
  };

  int
  main (void)
  {
    double y = 0;
    int z = 0;

    foo<double> x;

    x.doit (y);   // prints "same"
    x.doit (z);   // prints "mixed"

    return 0;
  }
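As a slightly fuller illustration of the same overloading idea applied
to strided copies, here is a minimal, self-contained sketch. It is not
the Octave Array class: strip_buffer, its std::vector storage, the
bounds-checking helper strips_fit, and the bool return value (true when
the memcpy path ran) are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: true if block_count strips of element_count
// elements, spaced stride apart and starting at offset, fit in size.
inline bool
strips_fit (std::size_t size, std::size_t offset, std::size_t element_count,
            std::size_t block_count, std::size_t stride)
{
  return block_count == 0
         || offset + (block_count - 1) * stride + element_count <= size;
}

// Hypothetical stand-in for Array<T>::ArrayRep: owns a flat buffer.
template <class T>
struct strip_buffer
{
  std::vector<T> data;

  explicit strip_buffer (std::size_t n) : data (n) { }

  // Generic mixed-type version: element-by-element copy with conversion.
  // Returns false to signal that the memcpy path was not used.
  template <class U>
  bool
  copy_strips (const strip_buffer<U>& source,
               std::size_t dest_offset, std::size_t source_offset,
               std::size_t element_count, std::size_t block_count,
               std::size_t source_stride, std::size_t dest_stride)
  {
    assert (strips_fit (data.size (), dest_offset, element_count,
                        block_count, dest_stride));
    assert (strips_fit (source.data.size (), source_offset, element_count,
                        block_count, source_stride));

    for (std::size_t i = 0; i < block_count; i++)
      for (std::size_t j = 0; j < element_count; j++)
        data[dest_offset + i * dest_stride + j]
          = static_cast<T> (source.data[source_offset + i * source_stride + j]);

    return false;
  }

  // Same-type overload: a non-template member is preferred over the
  // template when the argument types match exactly.  memcpy is only
  // valid for trivially copyable T, hence the static_assert.
  bool
  copy_strips (const strip_buffer<T>& source,
               std::size_t dest_offset, std::size_t source_offset,
               std::size_t element_count, std::size_t block_count,
               std::size_t source_stride, std::size_t dest_stride)
  {
    static_assert (std::is_trivially_copyable<T>::value,
                   "memcpy requires a trivially copyable T");
    assert (strips_fit (data.size (), dest_offset, element_count,
                        block_count, dest_stride));
    assert (strips_fit (source.data.size (), source_offset, element_count,
                        block_count, source_stride));

    if (element_count > 0)
      for (std::size_t i = 0; i < block_count; i++)
        std::memcpy (data.data () + dest_offset + i * dest_stride,
                     source.data.data () + source_offset + i * source_stride,
                     sizeof (T) * element_count);

    return true;
  }
};
```

Calling copy_strips on a strip_buffer<double> with a strip_buffer<int>
source selects the template, while a strip_buffer<double> source
selects the memcpy overload. Unlike the version above, this sketch
indexes from the base pointer on each iteration instead of advancing
raw pointers, so it never forms a pointer past the end of the buffer
after the last strip.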
jwe