[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: bug#22357: grep -f not only huge memory usage, but also huge time co
From: |
Norihiro Tanaka |
Subject: |
Re: bug#22357: grep -f not only huge memory usage, but also huge time cost |
Date: |
Wed, 21 Dec 2016 07:10:17 +0900 |
On Mon, 19 Dec 2016 15:38:12 -0800
Paul Eggert <address@hidden> wrote:
> but the old 'replace' called 'delete' up to N times,
Yes, but constraint == 0 does not happen mostly, so in delete() in
"while" does not pass normally.
> Anyway, I verified that the change improved performance on the test
> case with -f /usr/share/dict/words (not as much as your earlier patch
> improved things! just 5% to 10%, if I recall correctly), so even if
> I'm wrong about the big-O improvement the change should still be a
> win.
Although I do not look at it, I agree your change, as I believe that the
change does not cause slowdown for a case extreamly,
> > 1. dfa is not optimize parsed patterns.
> > For example, dfa does not replace pattern 'a \|a \|a \| ... \|a \|b'
> > to 'a \|b'.
> >
> > I tried it several times, have not succeeded it yet.
>
> That sounds a bit like REduce:
>
> Valgenti VC, Kim MS, Oh S-I, Lee I. REduce: removing redundancy from
> regular expression matching in network security. ICCCN 2015.
> http://dx.doi.org/10.1109/ICCCN.2015.7288457
>
> >
> > 2. grep matcher compiles regex always to check syntax of a pattern,
> > even when it is not used in searching.
> >
> > I also tried it, have not succeeded it yet.
>
> Yes, that sounds worthy too. Also, a lot of work has been done in
> regular-expression matches over the past several years, and if
> someone has the time it'd be nice to see whether GNU grep can use
> this stuff on modern architectures. For example:
>
> Cameron RD, Medforth N, Lin D, Denis D, Sumner WN. Bitwise data
> parallelism with LLVM: the icGrep case study. ICA3PP 2015. LNCS 9529.
> http://dx.doi.org/10.1007/978-3-319-27122-4_26
>
> Valgenti VC, Chhugani J, Sun Y, Satish N, Kim MS, Kim C, Dubey P.
> GPP-grep: high-speed regular expression processing engine on general
> purpose processors. RAID 2012. LNCS 7462.
> http://dx.doi.org/10.1007/978-3-642-33338-5_17
Special thanks, I will keep them in the back of my head.
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, (continued)
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/11
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Bruno Haible, 2016/12/11
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, arnold, 2016/12/11
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/11
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Bruno Haible, 2016/12/12
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/14
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/17
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/19
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/19
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/19
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost,
Norihiro Tanaka <=
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Norihiro Tanaka, 2016/12/12
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Trevor Cordes, 2016/12/12
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/12
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Trevor Cordes, 2016/12/12
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, Paul Eggert, 2016/12/12
- Re: bug#22357: grep -f not only huge memory usage, but also huge time cost, L.A. Walsh, 2016/12/15