bug#18577: Regexp I-search: [(error Stack overflow in regexp matcher)]
From: Stefan Monnier
Subject: bug#18577: Regexp I-search: [(error Stack overflow in regexp matcher)]
Date: Sun, 28 Sep 2014 13:35:30 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)
>> Is this a defect in my regexp or in the regexp engine?
> It is fundamental to the way regexp matching works.
To clarify: it is fundamental to the way *our* regexp engine works.
In principle, as long as the regexp doesn't use backrefs, it can be
matched efficiently, without backtracking (e.g. by an automaton-based
matcher). Of course using \(..\) (as opposed to using \(?:..\)) can also
make the problem harder, since the various different (but largely
equivalent) ways to match might need to be distinguishable via match-data.
But even though your regexp doesn't use backrefs, and even if you replace
all \(..\) with \(?:..\), your regexp will still cause problems because
our regexp engine does not try to optimize these kinds of cases.
So you have to do it by hand.
>> If the former, how could I rewrite the regexp so that it would not hit
>> these problems?
Maybe something like:
    /\*\(<insidecomment>\)*\*+/
where <insidecomment> is something like
    [^'*]\|\*+\([^/'*]\|'<afterquote>\)\|'<afterquote>
where <afterquote> is something like
    \([^'*]\|\*+[^/'*]\)*'
Though this will still push a backtrack point for every character.
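To see how these three pieces fit together, here is a minimal sketch that
substitutes <afterquote> and <insidecomment> into the outer pattern. It is a
transliteration into Python's `re` syntax, not Emacs syntax (an assumption of
this example: \(..\) becomes (?:..) and \| becomes |), and the names
AFTERQUOTE, INSIDECOMMENT and COMMENT are purely illustrative. Python's `re`
is itself a backtracking matcher, so this only checks which strings the
pattern accepts; it says nothing about the Emacs backtrack stack.

```python
import re

# Rest of a quoted section, up to and including the closing quote
# (Python-syntax version of <afterquote>).
AFTERQUOTE = r"(?:[^'*]|\*+[^/'*])*'"

# One step inside the comment body (Python-syntax version of <insidecomment>):
# an ordinary character, or stars followed by something other than "/",
# or a quoted section.
INSIDECOMMENT = r"[^'*]|\*+(?:[^/'*]|'" + AFTERQUOTE + r")|'" + AFTERQUOTE

# The full pattern: "/*", any number of steps, then one or more "*" and "/".
COMMENT = re.compile(r"/\*(?:" + INSIDECOMMENT + r")*\*+/")

# The match stops at the first "*/" that is outside a quoted section.
print(COMMENT.match("/* a */ b */").group(0))  # prints "/* a */"
```

Note that the loop body consumes a single character per iteration in the
common case, which is where the per-character backtrack point comes from.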
Maybe better would be something like
    /\*[^'*]*\(<insidecomment>\)*\*+/
where <insidecomment> is something like
    \(\*+[^/'*]\|\**'<afterquote>\)[^'*]*
where <afterquote> is still something like
    \([^'*]\|\*+[^/'*]\)*'
so that we should only push a backtrack point when we see a * or a ' in
the comment.
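As a rough check of this second form, the sketch below assembles it and runs
it on a few sample comments. It is written in Python's `re` syntax rather
than Emacs syntax (an assumption of this example: \(..\) becomes (?:..) and
\| becomes |); the names AFTERQUOTE, INSIDECOMMENT2, COMMENT2 and the sample
strings are hypothetical. It only shows which strings are accepted, since
Python's matcher does not model how the Emacs engine pushes backtrack points.

```python
import re

# Rest of a quoted section, up to and including the closing quote.
AFTERQUOTE = r"(?:[^'*]|\*+[^/'*])*'"

# One loop iteration: a run of stars not ending the comment, or a quoted
# section (possibly preceded by stars), followed by a run of plain characters.
INSIDECOMMENT2 = r"(?:\*+[^/'*]|\**'" + AFTERQUOTE + r")[^'*]*"

# "/*", a leading run of plain characters, the loop, then the closing "*/".
COMMENT2 = re.compile(r"/\*[^'*]*(?:" + INSIDECOMMENT2 + r")*\*+/")

samples = ["/* hello */", "/* a * b */", "/* 'x' */", "/***/"]
for text in samples:
    # Each of these sample comments is matched in full.
    assert COMMENT2.fullmatch(text) is not None

# An unterminated quote inside the comment prevents any match.
assert COMMENT2.search("/* 'unterminated */") is None
```

The design point is that runs of ordinary characters are consumed by
[^'*]* in one go, so the outer loop is only entered at a * or a '.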
Stefan