[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: will we ever have zero width assertions in regexps?
From: |
Stefan Monnier |
Subject: |
Re: will we ever have zero width assertions in regexps? |
Date: |
Fri, 28 Jan 2011 21:51:55 -0500 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux) |
>> To get rid of the occasional pathological case where matching takes
>> forever and Emacs appears to be frozen. Programmers who are used to
>> backtracking matchers will usually intuitively stay away from regexps
>> that can show such behaviors, but not all programmers do, and even if
>> you're careful there are cases that are hard to avoid.
> Did you try it with Perl recently (last 10 years or so)?
No, but neither have I bumped into pathological cases in Perl before
that (when I did use it).
> As I said, I put some optimizations which in most (AFAIK) practical
> senses remove such pathologies. (The underlying problems remain; the
> optimizations are only "heuristic"; but one needs to be extra
> inventive to circumvent the optimizations.)
A typical case could look something like "foo *(.*?) *bar". when
matching "foo ..<many space>.. baZ".
Emacs's regexp engine is not very clever and doesn't do much in terms of
avoiding backtracking (it mostly takes care of <foo>*<bar> when <foo>
can only match a single char and <bar> can only start with a char that's
not matched by <foo>), but I can't think of too many ways to handle the
above one efficiently within a "backtracking regexp matcher" framework.
>> Another minor reason is that it can be handy to have an incremental
>> matching primitive, so you can match over a long string one chunk at
>> a time. I'm not sure how often this would be useful, but I've come
>> across a few cases where it seemed like it could be put to good use
>> (tho, for lack of experience with it, I can't sweat that it would turn
>> out to be a good idea).
> Do not know what you mean by this...
Basically, provide a primitive like (match-string RE STRING LIMIT) that
can not only say "matched between START and END", but also "reached
LIMIT within yet finding a match, here's the suspended SEARCH-STATE at
LIMIT", so you can later resume the search starting at LIMIT by passing
that state.
Stefan
- will we ever have zero width assertions in regexps?, Le Wang, 2011/01/26
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/26
- Re: will we ever have zero width assertions in regexps?, Le Wang, 2011/01/26
- Message not available
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/26
- Re: will we ever have zero width assertions in regexps?, Ilya Zakharevich, 2011/01/27
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/27
- Re: will we ever have zero width assertions in regexps?, Ilya Zakharevich, 2011/01/28
- Re: will we ever have zero width assertions in regexps?,
Stefan Monnier <=
- Re: will we ever have zero width assertions in regexps?, Ilya Zakharevich, 2011/01/29
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/31
- Re: will we ever have zero width assertions in regexps?, Ilya Zakharevich, 2011/01/31
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/31