octave-maintainers

Re: strread.m


From: Philip Nienhuis
Subject: Re: strread.m
Date: Tue, 02 Aug 2011 23:20:21 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.11) Gecko/20100701 SeaMonkey/2.0.6

John W. Eaton wrote:
> On  2-Aug-2011, PhilipNienhuis wrote:
>
> | Lately I've been working on textscan / textread / strread (aided by Rik), so
> | I can give a little background info.
> |
> | The real work for textscan is done in strread.m
> | Given the current state of strread.m in the dev sources I think you wouldn't
> | need to spend time on format conversion specifiers, at least for now.
>
> I'm happy to not have to do anything, but it seems to me that a script
> to do this job will be slow, especially for large files.

Not only slow; as I outlined in bug #33875, the way strread is set up works well for simple, neatly aligned files but becomes close to a headache for complicated ones (like the %*Ns specifiers, and especially %c, %g and %[..] / %[^..], as these can "cross" delimiters).
(BTW, in my previous post I forgot to mention that %c is missing completely.)
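
To illustrate what "crossing" a delimiter means, here is a small C++ sketch (not Octave code and not taken from strread.m). It assumes a reader that first splits a line on its delimiters and only then looks at the format, and compares that with a scanf-style %[^;] conversion that consumes the text directly. The sample line, the delimiter set and all function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: split a line on any of the given delimiter characters,
// the way a "split first, convert later" reader might do it.
std::vector<std::string> split_fields(const std::string& line,
                                      const std::string& delims) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (start <= line.size()) {
        std::string::size_type end = line.find_first_of(delims, start);
        if (end == std::string::npos) end = line.size();
        fields.push_back(line.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}

// Hypothetical helper: let a scanf-style %[^;] conversion read everything up
// to the first ';' (at most 63 characters), ignoring the field delimiters.
std::string scan_until_semicolon(const std::string& line) {
    char buf[64] = "";
    std::sscanf(line.c_str(), "%63[^;]", buf);
    return buf;
}

// Usage sketch: the same line seen by both approaches.
void demo_crossing() {
    const std::string line = "Smith, John;42";

    // Splitting on ',' and ' ' first cuts the name into pieces.
    std::vector<std::string> fields = split_fields(line, ", ");
    assert(fields.size() == 3);
    assert(fields[0] == "Smith");
    assert(fields[1] == "");          // empty field between ',' and ' '
    assert(fields[2] == "John;42");

    // %[^;] runs straight across both delimiters.
    assert(scan_until_semicolon(line) == "Smith, John");
}
```

A reader that has already split the line has to stitch "Smith", "" and "John;42" back together to satisfy %[^;], which is where assumptions about the file layout come in.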

I fear that the practical limitations are dictated not so much by file size as by file complexity. Given strread's inner workings, processing %g and %[..] / %[^..] in particular needs a few vital assumptions about how the file is laid out, and for some files those assumptions can turn out to be dead wrong.

Obviously a compiled (binary) strread would be much better: faster, more predictable and less complex. But for now we just have to make do with what is available, and for files that are not too complicated it does work quite well.


> Maybe I don't remember correctly, but I thought that previous versions
> of strread converted format specifiers to something that could be used
> by scanf, then called scanf, and that this approach would not work for
> a number of the format specifiers that are needed by strread.  But
> that doesn't seem to be the way strread currently works, so maybe you
> can solve all the problems without needing a modified scanf-style
> function in C++.

I can't tell.

But I'd like to know whether directly reading (with sscanf or similar) e.g. an int8 from a string is superior, in the sense of conversion errors, to reading a double and casting it to int8 (and likewise for int16, int64, unsigned int8, ...).

If it is not, the bit-width specifiers can be implemented straightforwardly.
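
As a rough sketch of what this question comes down to: an IEEE double has a 53-bit significand, so every 8-, 16- and 32-bit integer survives the detour through double exactly, while 64-bit values above 2^53 may not. The C++ fragment below (function names are made up for illustration; it says nothing about how Octave or strread.m actually convert) compares the two routes. Out-of-range handling such as saturation versus wrap-around is a separate matter and is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Route 1: parse the text as a double, then cast to a 64-bit integer.
// Assumes the value is within the int64 range.
std::int64_t via_double(const char* s) {
    return static_cast<std::int64_t>(std::strtod(s, nullptr));
}

// Route 2: parse the text straight into a 64-bit integer.
std::int64_t direct(const char* s) {
    return static_cast<std::int64_t>(std::strtoll(s, nullptr, 10));
}

// Usage sketch: where the two routes agree and where they do not.
void demo_widths() {
    // Values that fit in int8 / int32: both routes give the same result.
    assert(via_double("-128") == direct("-128"));
    assert(via_double("2147483647") == direct("2147483647"));

    // 2^53 + 1 has no exact double representation, so the detour through
    // double cannot return it, while the direct route can.
    assert(direct("9007199254740993") == 9007199254740993LL);
    assert(via_double("9007199254740993") != 9007199254740993LL);
}
```

Under these assumptions the cast from double would only be a concern for the 64-bit types.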

Philip

