Re: [Octave] Improving Octave for large files

On Wed, Nov 11, 2009 at 12:51 AM, David Bateman <address@hidden> wrote:

Christian Brædstrup wrote:

I have been looking in the PROJECTS file in the source and wanted to hear if
anyone is working on the problem with large files that Juhana K. Kouhia talks
about (I couldn't find any code in the src/load-save.cc file to indicate
that). I have a friend working on the TPIE library (
http://www.madalgo.au.dk/Trac-tpie/) and thought it would fit nicely into
the Octave source. Does anyone have any concerns about including the TPIE
library, or any comments about how best to add the functionality?

That idea was proposed in 1994

http://old.nabble.com/Octave-question-to9226868.html#a9226868

and things have perhaps moved on a bit since. I'd say the large-file issues are now twofold:

1) Data sets with more elements than 2^31, which need 64-bit indexing. The ability to handle such data sets is in Octave but poorly tested. The loading and saving of files for such data sets is, however, not tested, though the HDF5 formats should be able to handle this.
2) Large data sets tend to go hand in hand with large computational problems, and the parallelisation and distribution of a database across many nodes could be improved.
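
To make the 2^31 limit in point 1 concrete, here is a small Python sketch of the arithmetic. It is only an illustration, not Octave code: the helper names (fits_in_int32, as_int32) and the 50000 x 50000 matrix size are assumptions chosen for the example.

```python
import ctypes

# Largest value a signed 32-bit index can hold.
INT32_MAX = 2**31 - 1


def fits_in_int32(n_elements):
    """Return True if an element count can be addressed with a signed 32-bit index."""
    return 0 <= n_elements <= INT32_MAX


def as_int32(n):
    """Return what a signed 32-bit integer would hold if n were stored in it.

    ctypes does no overflow checking, so values past INT32_MAX wrap around.
    """
    return ctypes.c_int32(n).value


# Hypothetical example: a 50000 x 50000 matrix of doubles.
rows, cols = 50_000, 50_000
numel = rows * cols          # total number of elements
size_bytes = numel * 8       # 8 bytes per double

if __name__ == "__main__":
    print("elements:", numel)
    print("fits in 32-bit index:", fits_in_int32(numel))
    print("value after 32-bit wrap:", as_int32(numel))
    print("size in GiB:", size_bytes / 2**30)
```

The element count no longer fits in a signed 32-bit integer, so any code path that still stores an index or a length in one (including file readers and writers) would see a wrapped, negative value. That is the kind of path that needs testing.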

I'm sorry, I don't really know what TPIE has to offer, but I suspect it defers reading data from a file until it's needed. In that case, integrating TPIE probably means implementing user types from the ground up (right down to a reimplementation of the Array class). Is the benefit worth the cost?
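
As a rough illustration of what "deferring reads until the data is needed" could look like, here is a minimal Python sketch. It is not TPIE's API and not how Octave's Array class works; the class LazyDoubleFile, the helper write_demo_file and the flat little-endian file layout are all assumptions made up for the example.

```python
import os
import struct
import tempfile


class LazyDoubleFile:
    """Hypothetical array-like view of a file of little-endian doubles.

    Nothing is loaded up front; each element is read from disk on access.
    """

    ITEM = struct.Struct("<d")  # one 8-byte little-endian double

    def __init__(self, path):
        self._fh = open(path, "rb")
        self._len = os.path.getsize(path) // self.ITEM.size
        self.reads = 0  # number of disk reads performed, for demonstration

    def __len__(self):
        return self._len

    def __getitem__(self, i):
        if not 0 <= i < self._len:
            raise IndexError(i)
        # Seek to the element's byte offset and read just that element.
        self._fh.seek(i * self.ITEM.size)
        self.reads += 1
        return self.ITEM.unpack(self._fh.read(self.ITEM.size))[0]

    def close(self):
        self._fh.close()


def write_demo_file(n):
    """Write n doubles (0.0, 0.5, 1.0, ...) to a temporary file; return its path."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as fh:
        for k in range(n):
            fh.write(struct.pack("<d", k * 0.5))
    return path


if __name__ == "__main__":
    path = write_demo_file(1000)
    arr = LazyDoubleFile(path)
    print(len(arr), arr[10], arr.reads)
    arr.close()
    os.remove(path)
```

The point of the sketch is that every access has to go through an accessor such as __getitem__ rather than through a pointer to memory that is already filled in. Code that assumes the whole array is resident would have to be changed to use that accessor, which is where the cost of reworking a core array type would come from.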

D.

--

David Bateman address@hidden
35 rue Gambetta +33 1 46 04 02 18 (Home)
92100 Boulogne-Billancourt FRANCE +33 6 72 01 06 33 (Mob)

From:	Christian Brædstrup
Subject:	Re: [Octave] Improving Octave for large files
Date:	Wed, 11 Nov 2009 13:38:39 +0100