Re: strread.m

octave-maintainers


From:	Philip Nienhuis
Subject:	Re: strread.m
Date:	Tue, 02 Aug 2011 23:20:21 +0200
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.11) Gecko/20100701 SeaMonkey/2.0.6

John W. Eaton wrote:

On  2-Aug-2011, PhilipNienhuis wrote:

| Lately I've been working on textscan / textread / strread (aided by Rik), so
| I can give a little background info.
|
| The real work for textscan is done in strread.m
| Given the current state of strread.m in the dev sources I think you wouldn't
| need to spend time on format conversion specifiers, at least for now.

> I'm happy to not have to do anything, but it seems to me that a script
> to do this job will be slow, especially for large files.

Not only slow; as I outlined in bug #33875, the way strread is set up works well for simple and neatly aligned files but becomes close to a headache for the complicated ones (like the %*Ns, and especially %c, %g and %[..] / %[^..] specifiers as these can "cross" delimiters).
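
To make "crossing" a delimiter concrete, here is a minimal sketch in Python. It is not how strread.m is written; it assumes a hypothetical reader that splits the text on delimiters first, and compares that with a scanf-style %[^;] scan. The function names and the sample line are made up for illustration.

```python
import re


def split_then_convert(line, delimiter=","):
    """Hypothetical 'split first' strategy: cut the line at every delimiter."""
    return line.split(delimiter)


def scan_bracket_set(line, excluded):
    """Emulate a scanf-style %[^...] field: take characters up to the first
    character that appears in `excluded`, regardless of any delimiters."""
    match = re.match("[^" + re.escape(excluded) + "]+", line)
    return match.group(0) if match else ""


line = "Smith, John;42"

# Splitting on "," first cuts the name in two and glues "John" to the number.
print(split_then_convert(line))

# A %[^;] field is expected to run up to the ";" and so spans the ",".
print(scan_bracket_set(line, ";"))
```

In this sketch the split-first route has already destroyed the field that %[^;] is supposed to return, which is the kind of case where assumptions about the file layout go wrong.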

(BTW in my previous post I forgot to mention %c as missing completely)

I fear that practical limitations aren't so much dictated by file size but rather by file complexity. Given strread's inner working, especially %g and %[..] %[^..] processing will need a few vital assumptions about how the file sticks together that for some cases can turn out to be dead wrong.

Obviously a binary strread will be much better, faster, more predictable and less complex, but for now we just have to make do with what is available.

But for not too complicated files it does work quite well.

> Maybe I don't remember correctly, but I thought that previous versions
> of strread converted format specifiers to something that could be used
> by scanf, then called scanf, and that this approach would not work for
> a number of the format specifiers that are needed by strread.  But
> that doesn't seem to be the way strread currently works, so maybe you
> can solve all the problems without needing a modified scanf-style
> function in C++.


I can't tell.

But I'd like to know if directly reading (by some sscanf or so) e.g., an int8 from a string is superior (in the sense of conversion errors) to casting from double to int8. (and int16, int64, unsigned int8, ...)

Because if not, the bit width specifiers can be straightforwardly implemented.
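
One part of this question can be checked with plain arithmetic. The sketch below is Python rather than Octave and only illustrates IEEE 754 double behaviour; the function names are made up for illustration, and it says nothing about how sscanf or strread treat malformed or out-of-range input. It compares reading the digits straight into an integer with reading them as a double first and converting afterwards.

```python
def parse_direct(text):
    """'Direct read' route: the digits go straight into an exact integer."""
    return int(text)


def parse_via_double(text):
    """'Cast' route: read an IEEE 754 double first, then convert to integer."""
    return int(float(text))


def roundtrip_is_exact(text):
    """True if going through a double yields the same integer as a direct read."""
    return parse_direct(text) == parse_via_double(text)


# Sample values at the edges of several integer widths, plus the first
# integer that a double cannot represent (2**53 + 1).
samples = {
    "int8 max": str(2**7 - 1),
    "int8 min": str(-(2**7)),
    "int32 max": str(2**31 - 1),
    "2**53": str(2**53),
    "2**53 + 1": str(2**53 + 1),
    "int64 max": str(2**63 - 1),
}

for label, text in samples.items():
    print(label, text, roundtrip_is_exact(text))
```

A double carries a 53-bit significand, so every int8, int16 and int32 value (and the unsigned counterparts) survives the detour unchanged. Only the 64-bit types can lose digits, starting at magnitudes above 2^53.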


Philip
