Re: a very slow problem of replacing string
From: Chet Ramey
Subject: Re: a very slow problem of replacing string
Date: Fri, 24 Sep 2010 09:07:21 -0400
User-agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4
On 9/24/10 8:19 AM, Greg Wooledge wrote:
> On Thu, Sep 23, 2010 at 10:12:28PM +0900, sky wrote:
>> #
>> # prepare 1000 strings of 6 digits
>> #
>> TEST_LIST=`seq 100100 100 200000`
>> echo $TEST_LIST | wc
>
> Actually, this is one gigantic string, not 1000 strings.
>
>> #
>> # delete "150000"
>> #
>> T0=$SECONDS
>> A=${TEST_LIST//150000}
>> T1=$SECONDS
>> B=`echo $TEST_LIST | sed s/150000//g`
>> T2=$SECONDS
>
> Yes, it's known that operations on very large strings in bash can take
> a long time. (Chet may be able to address that problem; I can't.)
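
(As an aside, and purely as a sketch of Greg's point: if the goal really is
1000 separate six-digit strings, a bash array is the natural container, and
deleting "150000" then means unsetting one small element rather than
rewriting one multi-kilobyte string.)

    # Sketch only -- not the original test.
    TEST_ARRAY=( $(seq 100100 100 200000) )
    echo "${#TEST_ARRAY[@]} strings"          # 1000

    for i in "${!TEST_ARRAY[@]}"; do
      [[ ${TEST_ARRAY[i]} == 150000 ]] && unset 'TEST_ARRAY[i]'
    done
    echo "${#TEST_ARRAY[@]} strings left"     # 999
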
The problem involves performing pattern substitution on very
large strings in an environment that the locale indicates supports
multibyte characters.
The solution is to bound the number of comparisons the matcher has
to do, which reduces the number of multibyte string conversions
and wide character comparisons necessary. I've done a lot of work on
this, and the changes will be in bash-4.2.
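
For anyone stuck on an older bash in the meantime, one rough way to see
(and sidestep) the locale effect is to force a single-byte locale before
doing the substitution. This is only an illustrative sketch; timings are
machine-dependent, and it assumes the shell's current locale is a
multibyte one such as en_US.UTF-8:

    TEST_LIST=$(seq 100100 100 200000)

    # Substitution under the current (assumed multibyte) locale.
    T0=$SECONDS
    A=${TEST_LIST//150000}
    echo "multibyte locale: $((SECONDS - T0))s"

    # Same substitution with a single-byte locale; bash then skips the
    # multibyte string conversions and wide-character comparisons.
    (
      LC_ALL=C
      T0=$SECONDS
      B=${TEST_LIST//150000}
      echo "C locale: $((SECONDS - T0))s"
    )
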
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/