bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: bug#22357: grep -f not only huge memory usage, but also huge time co


From: Norihiro Tanaka
Subject: Re: bug#22357: grep -f not only huge memory usage, but also huge time cost
Date: Wed, 21 Dec 2016 07:10:17 +0900

On Mon, 19 Dec 2016 15:38:12 -0800
Paul Eggert <address@hidden> wrote:

> but the old 'replace' called 'delete' up to N times,

Yes, but constraint == 0 does not happen mostly, so in delete() in
"while" does not pass normally.

> Anyway, I verified that the change improved performance on the test
> case with -f /usr/share/dict/words (not as much as your earlier patch
> improved things! just 5% to 10%, if I recall correctly), so even if
> I'm wrong about the big-O improvement the change should still be a
> win.

Although I do not look at it, I agree your change, as I believe that the
change does not cause slowdown for a case extreamly, 

> > 1. dfa is not optimize parsed patterns.
> >      For example, dfa does not replace pattern 'a \|a \|a \| ... \|a \|b'
> >      to 'a \|b'.
> >
> >      I tried it several times, have not succeeded it yet.
> 
> That sounds a bit like REduce:
> 
> Valgenti VC, Kim MS, Oh S-I, Lee I. REduce: removing redundancy from
> regular expression matching in network security. ICCCN 2015.
> http://dx.doi.org/10.1109/ICCCN.2015.7288457
> 
> >
> >   2. grep matcher compiles regex always to check syntax of a pattern,
> >      even when it is not used in searching.
> >
> >      I also tried it, have not succeeded it yet.
> 
> Yes, that sounds worthy too. Also, a lot of work has been done in
> regular-expression matches over the past several years, and if
> someone has the time it'd be nice to see whether GNU grep can use
> this stuff on modern architectures. For example:
> 
> Cameron RD, Medforth N, Lin D, Denis D, Sumner WN. Bitwise data
> parallelism with LLVM: the icGrep case study. ICA3PP 2015. LNCS 9529.
> http://dx.doi.org/10.1007/978-3-319-27122-4_26
> 
> Valgenti VC, Chhugani J, Sun Y, Satish N, Kim MS, Kim C, Dubey P.
> GPP-grep: high-speed regular expression processing engine on general
> purpose processors. RAID 2012. LNCS 7462. 
> http://dx.doi.org/10.1007/978-3-642-33338-5_17

Special thanks, I will keep them in the back of my head.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]