octave-maintainers

Re: new strsplit function


From: Ben Abbott
Subject: Re: new strsplit function
Date: Mon, 01 Apr 2013 08:18:12 -0400

On Apr 1, 2013, at 2:06 AM, Rik wrote:

> On 03/31/2013 05:38 PM, Ben Abbott wrote:
>> Rik,
>> 
>> I've pushed a changeset that includes a note in the NEWS file.
>> 
>>      http://hg.savannah.gnu.org/hgweb/octave/rev/1de4ec2a856d
>> 
>> I have not run any benchmarks.  If it is any help, the new version is based 
>> on regexp().
> Ben,
> 
> Unfortunately regexp is slow compared to operations on native char types.
> 
> I did a quick benchmark and I do think this is likely to be an issue.  The
> strread function is now 30X slower.
> 
> Benchmark Code:
> cd scripts/string
> tic; A = textread ("strtok.m", "%s"); toc
> 
> New results:
> 1.502 +/- .004
> 
> Old results:
> 0.0455 +/- .0006
> 
> Slowdown = 1.502 / .0455 = 33
> 
> strtok.m is a small file, 7.2KB, so a largish real data file is going to
> parse very slowly.
> 
>> 
>> If we are to add another script, perhaps cstrsplit() is a good name (we did 
>> that for strcat some time ago), where "c" stands for (c)onventional.
> That would be a good idea.
> 
> Also, shouldn't the replacement of existing instances have been
> 
> strsplit (str, del, "collapsedelimiters", false)
> 
> rather than
> 
> strsplit (str, del, false)
> 
> ???
> 
> I don't see that the Matlab function accepts a third argument--only
> PROP/VALUE pairs.
> 
> Cheers,
> Rik

Both

        strsplit (str, del, "collapsedelimiters", false)

and

        strsplit (str, del, false)

should give the same result.  Treating the third argument in this way was part 
of our original implementation, and I kept it for backward compatibility.
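
As a quick check of that equivalence, something like the following could be run. It is only a sketch: the test string is made up, and it assumes the strsplit under discussion accepts a logical third positional argument as described above.

```octave
str = "a,,b";   # two adjacent delimiters, so collapsing matters
del = ",";

# Property/value form (the Matlab-compatible spelling).
c_prop = strsplit (str, del, "collapsedelimiters", false);

# Positional logical third argument, kept for backward compatibility.
# Assumption: the installed strsplit still accepts this form.
c_pos = strsplit (str, del, false);

# Compare the two cell arrays element by element.
same = isequal (c_prop, c_pos);
disp (same)
```

If the two forms ever diverge, isequal would report false here, which makes this a cheap regression test for the backward-compatible path.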

Regarding the slowdown, another option is to add a new delimiter type, say 
"conventional". (I'm embarrassed that I hadn't thought of that approach before.)
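
To illustrate the kind of regexp-free path such a delimiter type might take, here is a rough sketch that splits on a single-character delimiter using only native char operations. It is not the code in the changeset; the variable names are hypothetical, and it handles neither multi-character delimiters nor delimiter collapsing.

```octave
str = "a,b,,c";
del = ",";                          # single-character delimiter only

idx    = find (str == del);         # positions of every delimiter
starts = [1, idx + 1];              # first character of each field
stops  = [idx - 1, numel(str)];     # last character of each field

fields = cell (1, numel (starts));
for k = 1:numel (starts)
  # An empty range (start > stop) yields an empty field, so adjacent
  # delimiters are not collapsed.
  fields{k} = str(starts(k):stops(k));
endfor
```

Whether this actually recovers the old strread timings would need to be measured with the same tic/toc benchmark Rik used.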

It should also be possible to include support for 2D character input, but I'll 
need to put some work into that before being sure.

Ben


